= Main Page =

== Welcome to the cluster documentation wiki ==

This wiki is the location for documentation for the STRI cluster. If you are a cluster user, feel free to register for an account so that you can describe any method you use to do a particular thing on the cluster.

Some local documentation on [[MediaWiki]] is available.
== Cluster basics ==

* [[Accounts]]
* [[Policies]]
* [[Access]]
* [[Architecture]]
* [[Networking]]
* [[Administrators]]' contact details

== Using the cluster ==

* [[Jobs]]
* [[Queues]]
* [[MPI]]
* [[Parallelization|How to parallelize your job]]

= MediaWiki =

== Getting started with MediaWiki ==

* How to use the software: [http://meta.wikimedia.org/wiki/Help:Contents User's Guide]
* [http://www.mediawiki.org/wiki/Manual:Configuration_settings Configuration settings list]
* [http://www.mediawiki.org/wiki/Manual:FAQ MediaWiki FAQ]
* [https://lists.wikimedia.org/mailman/listinfo/mediawiki-announce MediaWiki release mailing list]

= Accounts =

To get an account, speak to John Atkinson in E117C. Accounts are available to the following classes of people:

* Members of the Centre for Astrophysics Research (CAR)
* Members of the Centre for Atmospheric & Instrumentation Research (CAIR)
* Members of the School of Computer Science (CS)
* Others, by special arrangement; restricted to those who have made a financial contribution to the cluster.

Access is granted subject to observance of our usage [[policies]].
= Policies =

The cluster is by design a shared resource. In using it you must be considerate of other users. Some detailed guidelines are as follows:

* The [[architecture|head node]] must '''never''' be used for computation of any kind on a larger scale than a few minutes' testing.
* The normal method of using the compute nodes is by way of the [[jobs|batch queuing system]]. You '''must not''' log in to the compute nodes directly unless you have been given specific authorization to do so (e.g. if you have approval to dedicate one or more nodes to data reduction tasks).
* When using the batch queuing system you '''must''' honour the allocations of nodes that your job is given.
* Please use the [[storage]] with consideration for others. We reserve the right in extreme situations to tidy up after you.

= Access =

The [[architecture|head node]] of the cluster is accessible from within the university by ssh to stri-cluster.herts.ac.uk, once you have an [[accounts|account]] set up. Currently it is not possible to access the head node from outside the University network; if you need this, please discuss it with the [[administrators]].

Individual compute nodes must be accessed via the head node: see also the [[policies|policy]] relating to this.
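For example, a typical login from a machine inside the University network looks like the following (the username is a placeholder; substitute your own cluster account name):

<pre>
# log in to the cluster head node
ssh yourusername@stri-cluster.herts.ac.uk
</pre>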
= Administrators =

These are currently:

* John Atkinson, j.atkinson@herts.ac.uk (x3358, room E117C)
* Martin Hardcastle, m.j.hardcastle@herts.ac.uk (x3409, room 1E116).

Contact us with queries.

= Architecture =

The cluster consists of

* a head node, which is an 8-core Xeon-based machine with 32 GB RAM
* 80 compute nodes (or just 'nodes'), of which
** 48 8-core Xeons with 24 GB RAM and DDR Infiniband form the Main cluster
** 32 8-core Xeons with 12 GB RAM and QDR Infiniband form the CAIR cluster
* 110 TB of [[storage]] attached via Fibre Channel to the head node
* Ethernet and Infiniband switches to provide connectivity.
The nodes are Dell blades installed in enclosures (chassis) of 16 blades each, so there are 5 chassis in total: 3 in the main cluster and 2 in the CAIR cluster. Nodes within a given chassis communicate via switches internal to the chassis; external switches connect the chassis together. See [[networking]] for more details.

= Storage =

The '''storage''' available on the cluster is set up as follows:

* 1 TB of user home directories, mounted as /home
* 65 TB of scratch space available to all users, mounted as /stri-data
* 40 TB of scratch space for CAIR users only, mounted as /cair-data

There is also an area for software to be shared over the network, mounted as /soft. These paths should work on both the head node and the compute nodes.

Please use the scratch space responsibly: when you no longer need it, delete the contents or transfer them elsewhere so that others can use the space. Large astronomical datasets will be processed on the cluster. (See also [[policies]].)
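As a quick sanity check before writing large datasets, you can see how full these file systems currently are using standard Linux tools (nothing here is specific to this cluster; the mount points are those listed above):

<pre>
# report the size, usage and free space of the cluster-wide file systems
df -h /home /stri-data /cair-data /soft
</pre>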
= Jobs =

The main mechanism for carrying out computing on the cluster is the '''batch queue system'''. This is based around [http://www.clusterresources.com/torquedocs21/ Torque] (formerly known as PBS) and [http://www.clusterresources.com/products/maui/docs/mauiadmin.shtml Maui], which are free software and widely used in the HPC world, particularly in academia. Rather than running processes directly on the compute nodes, you submit a script which, in general, contains both directives to the queuing system and the commands needed to run your jobs.

The scheduler will then decide, based on the available resources, whether your job can be run immediately or must be deferred until later. If other people's jobs are running, you are likely to have to wait. It is therefore important to request only the minimum resources you need: a job that only needs 8 nodes may be able to run immediately, while a job that requests all 48 nodes in the main cluster may need to wait for days.

Once your job starts running, the standard output and standard error can be saved or discarded, at your option. Most jobs will operate on files in one of the cluster-wide file systems (see [[storage]]). You can also arrange to have the system send you e-mail when your job starts and/or ends.

== Basic commands ==

The command to submit a job to the batch queue system is <tt>qsub</tt>. <tt>qsub</tt> takes one obligatory argument, the name of the script that will be run to start the job:

<pre>
qsub myjob.sh
</pre>

This must be a script: specifying a binary file will not work.

Optionally, qsub can take a number of arguments that specify what resources the job requires and how it is to be handled by the queuing system. Alternatively, these can be specified in the script, tagged with lines beginning <tt>#PBS</tt> before any executable lines. For example,

<pre>
cat << END > myjob1.sh
#!/bin/sh
echo "Hello world"
END
qsub -N hello -m abe myjob1.sh
</pre>

and

<pre>
cat << END > myjob2.sh
#!/bin/sh
#PBS -N hello
#PBS -m abe
echo "Hello world"
END
qsub myjob2.sh
</pre>

are functionally equivalent. It is often more useful to have the options specified in the script. You can also have a mixture: if the same option is specified on the <tt>qsub</tt> command line and in a script directive, the command line takes precedence.
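To see the precedence rule in action, here is a hypothetical resubmission of the <tt>myjob2.sh</tt> script created above (the job name <tt>hello2</tt> is just an illustration):

<pre>
# the command-line -N overrides the "#PBS -N hello" directive in the script,
# so this job appears in the queue as "hello2"
qsub -N hello2 myjob2.sh
</pre>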
A full list of qsub options is given by <tt>man qsub</tt> or is available [http://www.clusterresources.com/torquedocs21/commands/qsub.shtml online]. Some important ones are as follows:

* -N: specify the name of the job as it will appear in the queue
* -m: say when the system should e-mail you: <tt>-m abe</tt> means e-mail on abort, begin and end
* -k: specify whether the standard output or standard error streams of the job should be preserved (by default in your home directory)
* -q: specify which [[queues|queue]] the job will be run on.

The most important option is <tt>-l</tt>, which specifies the resources to be used by the job. On our system the useful resources to specify are

* <tt>walltime</tt>: the expected execution time in h:m:s format.
* <tt>nodes</tt>: a list of requirements on the nodes to be allocated, or possibly more than one joined by a plus sign.

For example,

<pre>
qsub -l walltime=1:0:0 -l nodes=16:ppn=8 job.sh                            ## run for one hour on 16 nodes with 8 CPUs per node
qsub -l walltime=0:1:0 -l nodes=2:chassis1:ppn=4 job2.sh                   ## run for one minute on 2 nodes from chassis1 with 4 CPUs per node
qsub -l walltime=0:5:0 -l nodes=1:chassis1:ppn=1+1:chassis2:ppn=1 job3.sh  ## run for 5 minutes on one node from chassis1 and one from chassis2
qsub -l walltime=0:0:10 -l nodes=node001.infi:ppn=8 job4.sh                ## run for 10 seconds on all CPUs of node001
</pre>

(In general you will not want to specify node or [[architecture|chassis]] names directly, as this will interfere with the efficiency of the scheduling. The exception might be if you want to guarantee that all inter-process communication takes place within a chassis; see [[networking]].)

The defaults for all the other options are sensible, but you ''must'' think carefully about the resources you need. Requesting more time or resources than you need will lead to inefficiency in scheduling your jobs; requesting less than you need (particularly in terms of walltime) is likely to mean that your job does not run to completion. A job that exceeds its walltime estimate will be terminated.
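Putting these pieces together, a complete submission script might look something like the following. This is only a sketch: the job name, resource requests, scratch directory and program name are all illustrative, not recommendations for real jobs.

<pre>
#!/bin/sh
# example.sh -- all queuing options given as #PBS directives
# (job name, resources and paths below are purely illustrative)
#PBS -N mysim
#PBS -m abe
#PBS -l walltime=2:00:00
#PBS -l nodes=4:ppn=8
#PBS -k oe

# work in the shared scratch area, then run the program
cd /stri-data/yourusername/run1
./mysim input.dat
</pre>

It would be submitted in the usual way with <tt>qsub example.sh</tt>.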
Once a job is submitted, you can view its progress with <tt>qstat</tt>:

<pre>
>qstat
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
1765.stri-cluster         HCG62-doublebeta mjh             00:00:00 R all
1766.stri-cluster         ...59-doublebeta mjh             00:00:00 R all
1767.stri-cluster         ...07-doublebeta mjh             00:00:00 R all
1768.stri-cluster         ...83-doublebeta mjh             00:00:00 R all
1769.stri-cluster         ...73-doublebeta mjh             00:00:00 R all
1770.stri-cluster         ...61-doublebeta mjh                    0 Q all
1771.stri-cluster         ...36-doublebeta mjh                    0 Q all
1772.stri-cluster         ...44-doublebeta mjh                    0 Q all
1773.stri-cluster         ...29-doublebeta mjh                    0 Q all
1774.stri-cluster         ...33-doublebeta mjh                    0 Q all
1775.stri-cluster         ...46-doublebeta mjh                    0 Q all

>qstat -a

stri-cluster.herts.ac.uk:
                                                                   Req'd  Req'd   Elap
Job ID               Username Queue    Jobname    SessID NDS   TSK Memory Time  S Time
-------------------- -------- -------- ---------- ------ ----- --- ------ ----- - -----
1765.stri-cluster.he mjh      all      HCG62-doub  11125    16  --     -- 05:00 R 01:56
1766.stri-cluster.he mjh      all      IC1459-dou  10658    16  --     -- 05:00 R 01:56
1767.stri-cluster.he mjh      all      NGC3607-do   7355    16  --     -- 05:00 R 01:56
1768.stri-cluster.he mjh      all      NGC383-dou  12069    16  --     -- 05:00 R 01:56
1769.stri-cluster.he mjh      all      NGC4073-do  12793    16  --     -- 05:00 R 01:56
1770.stri-cluster.he mjh      all      NGC4261-do     --    16  --     -- 05:00 Q    --
1771.stri-cluster.he mjh      all      NGC4636-do     --    16  --     -- 05:00 Q    --
1772.stri-cluster.he mjh      all      NGC5044-do     --    16  --     -- 05:00 Q    --
1773.stri-cluster.he mjh      all      NGC5129-do     --    16  --     -- 05:00 Q    --
1774.stri-cluster.he mjh      all      NGC533-dou     --    16  --     -- 05:00 Q    --
1775.stri-cluster.he mjh      all      NGC5846-do     --    16  --     -- 05:00 Q    --
</pre>

The different options to qstat give different views of the queue. Here we see that some jobs are running (R) and some are queued (Q). (Note that in the default <tt>qstat</tt> view the CPU time is not displayed correctly.)

You can only view your own jobs using qstat; however, the MAUI tool <tt>showq</tt> gives a view of the whole queue:

<pre>
/usr/local/maui/bin/showq

ACTIVE JOBS--------------------
JOBNAME            USERNAME      STATE  PROC   REMAINING            STARTTIME

1765                    mjh    Running   128     1:15:20  Fri May  7 13:38:20
1766                    mjh    Running   128     1:15:20  Fri May  7 13:38:20
1767                    mjh    Running   128     1:15:20  Fri May  7 13:38:20
1768                    mjh    Running   128     1:15:20  Fri May  7 13:38:20
1769                    mjh    Running   128     1:15:20  Fri May  7 13:38:20

     5 Active Jobs     640 of  640 Processors Active (100.00%)
                        80 of   80 Nodes Active      (100.00%)

IDLE JOBS----------------------
JOBNAME            USERNAME      STATE  PROC     WCLIMIT            QUEUETIME

1770                    mjh       Idle   128     5:00:00  Fri May  7 13:38:20
1771                    mjh       Idle   128     5:00:00  Fri May  7 13:38:20
1772                    mjh       Idle   128     5:00:00  Fri May  7 13:38:20
1773                    mjh       Idle   128     5:00:00  Fri May  7 13:38:20
1774                    mjh       Idle   128     5:00:00  Fri May  7 13:38:20
1775                    mjh       Idle   128     5:00:00  Fri May  7 13:38:20

6 Idle Jobs

BLOCKED JOBS----------------
JOBNAME            USERNAME      STATE  PROC     WCLIMIT            QUEUETIME


Total Jobs: 11   Active Jobs: 5   Idle Jobs: 6   Blocked Jobs: 0
</pre>

Finally, you can delete jobs from the queue with the Torque command <tt>qdel</tt>.
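For example, to remove one of the queued jobs shown above (the numeric part of the job ID reported by <tt>qstat</tt> is normally sufficient):

<pre>
qdel 1770
</pre>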
''You must honour this node list.'' The [http://www.clusterresources.com/torquedocs21/commands/pbsdsh.shtml pbsdsh] command can be used in your script to run a single program on all the allocated nodes, as in the following example: <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ pbsdsh hostname echo ------------------------------------------------------ echo Job ends </pre> This command would be suitable for starting a number of parallel processes which do not need to communicate, e.g. multiple identical Monte Carlo simulations. For scripts to start [[MPI]] processes in such a way that inter-process communication (IPC) works see the description of [[MPI]]. 39b2c132c364b69a1f52dd95554de05f4cef4304 42 38 2010-05-09T11:07:48Z Mjh 2 wikitext text/x-wiki The main mechanism for carrying out computing on the cluster is the '''batch queue system'''. This is based around [http://www.clusterresources.com/torquedocs21/ Torque] (formerly known as PBS) and [http://www.clusterresources.com/products/maui/docs/mauiadmin.shtml Maui], which are free software and widely used in the HPC world, particularly in academia. Rather than running processes directly on the compute nodes, you submit a script which, in general, contains both directives to the queuing system and the commands needed to run your jobs. The scheduler will then decide, based on the available resources, whether your job can be run immediately or must be deferred till later. If other people's jobs are running, you are likely to have to wait. It is important therefore to request only the minimum resources you need: a job that only needs 8 nodes may be able to run immediately while a job that requests all 48 nodes in the main cluster may need to wait for days. Once your job starts running the standard output and standard error can be saved or discarded, at your option. Most jobs will operate on files in one of the cluster-wide file systems (see [[storage]]). You can also arrange to have the system send you e-mail when your job starts and/or ends. == Basic commands == The command to submit a job to the batch queue system is <tt>qsub</tt>. <tt>qsub</tt> takes one obligatory argument, the name of the script that will be run to start the job: <pre> qsub myjob.sh </pre> This must be a script: specifying a binary file will not work. Optionally, qsub can take a number of arguments that specify what resources the job requires and how it is to be handled by the queuing system. Alternatively, these can be specified in the script, tagged with lines beginning <tt>#PBS</tt> before any executable lines. For example, <pre> cat << END > myjob1.sh #!/bin/sh echo "Hello world" END qsub -N hello -m abe myjob1.sh </pre> and <pre> cat <<END > myjob2.sh #!/bin/sh #PBS -N hello #PBS -m abe echo "Hello world" END qsub myjob2.sh </pre> are functionally equivalent. 
It is often more useful to have the options specified in the script. You can also have a mixture: if the same option is specified on the <tt>qsub</tt> command line and in a script directive, the command line takes precedence. A full list of qsub options is given by <tt>man qsub</tt> or is available [http://www.clusterresources.com/torquedocs21/commands/qsub.shtml online]. Some important ones are as follows: * -N: specify the name of the job as it will appear in the queue * -m: say when the system should e-mail you: <tt>-m abe</tt> means to e-mail on abort, begin and end * -k: specify whether the standard output or standard error streams of the job should be preserved (by default in your home directory). * -q: specify what [[queues|queue]] the job will be run on. The most important option is <tt>-l</tt>, which specifies the resources to be used by the job. On our system the useful resources to specify are * <tt>walltime</tt>: the expected execution time in h:m:s format. * <tt>nodes</tt>: a list of requirements on the nodes to be allocated, or possibly more than one joined by a plus sign. For example, <pre> qsub -l walltime=1:0:0 -l nodes=16:ppn=8 job.sh ## run for one hour on 16 nodes with 8 CPUs per node qsub -l walltime=0:1:0 -l nodes=2:chassis1:ppn=4 job2.sh ## run for one minute on 2 nodes from chassis1 with 4 CPUs per node qsub -l walltime=0:5:0 -l nodes=1:chassis1:ppn=1+1:chassis2:ppn=1 job3.sh ## run for 5 minutes on one node from chassis1 and one from chassis2 qsub -l walltime=0:0:10 -l nodes=node001.infi:ppn=8 job4.sh ## run for 10 seconds on all CPUs of node001 </pre> (In general you will not want to specify node or [[architecture|chassis]] names directly, as this will interfere with the efficiency of the scheduling. The exception might be if you want to guarantee that all inter-process communications take place within a chassis; see [[networking]]). The defaults for all the other options are sensible but you ''must'' think carefully about the resources you need. Requesting more time or resources than you need will lead to inefficiency in scheduling your jobs: requesting less than you need (particularly in terms of walltime) is likely to mean that your job does not run to completion. A job that exceeds the walltime estimate will be terminated. 
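Before moving on to monitoring, here is a small sketch of the option-precedence rule mentioned above, reusing the <tt>myjob2.sh</tt> hello-world script from the earlier example:
<pre>
qsub myjob2.sh                ## runs with the name given by the script's #PBS -N directive ("hello")
qsub -N hello-test myjob2.sh  ## the command-line -N takes precedence, so the job appears as "hello-test"
</pre>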
Once a job is submitted, you can view its progress with <tt>qstat</tt>: <pre> >qstat Job id Name User Time Use S Queue ------------------- ---------------- --------------- -------- - ----- 1765.stri-cluster HCG62-doublebeta mjh 00:00:00 R all 1766.stri-cluster ...59-doublebeta mjh 00:00:00 R all 1767.stri-cluster ...07-doublebeta mjh 00:00:00 R all 1768.stri-cluster ...83-doublebeta mjh 00:00:00 R all 1769.stri-cluster ...73-doublebeta mjh 00:00:00 R all 1770.stri-cluster ...61-doublebeta mjh 0 Q all 1771.stri-cluster ...36-doublebeta mjh 0 Q all 1772.stri-cluster ...44-doublebeta mjh 0 Q all 1773.stri-cluster ...29-doublebeta mjh 0 Q all 1774.stri-cluster ...33-doublebeta mjh 0 Q all 1775.stri-cluster ...46-doublebeta mjh 0 Q all >qstat -a stri-cluster.herts.ac.uk: Req'd Req'd Elap Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time -------------------- -------- -------- ---------- ------ ----- --- ------ ----- - ----- 1765.stri-cluster.he mjh all HCG62-doub 11125 16 -- -- 05:00 R 01:56 1766.stri-cluster.he mjh all IC1459-dou 10658 16 -- -- 05:00 R 01:56 1767.stri-cluster.he mjh all NGC3607-do 7355 16 -- -- 05:00 R 01:56 1768.stri-cluster.he mjh all NGC383-dou 12069 16 -- -- 05:00 R 01:56 1769.stri-cluster.he mjh all NGC4073-do 12793 16 -- -- 05:00 R 01:56 1770.stri-cluster.he mjh all NGC4261-do -- 16 -- -- 05:00 Q -- 1771.stri-cluster.he mjh all NGC4636-do -- 16 -- -- 05:00 Q -- 1772.stri-cluster.he mjh all NGC5044-do -- 16 -- -- 05:00 Q -- 1773.stri-cluster.he mjh all NGC5129-do -- 16 -- -- 05:00 Q -- 1774.stri-cluster.he mjh all NGC533-dou -- 16 -- -- 05:00 Q -- 1775.stri-cluster.he mjh all NGC5846-do -- 16 -- -- 05:00 Q -- </pre> The different options to qstat give different views of the queue. Here we see that some jobs are running (R) and some are queued (Q). (Note that in the default <tt>qstat</tt> view the CPU time is not displayed correctly.) You can only view your own jobs using qstat; however, the MAUI tool <tt>showq</tt> gives a view of the whole queue: <pre> /usr/local/maui/bin/showq ACTIVE JOBS-------------------- JOBNAME USERNAME STATE PROC REMAINING STARTTIME 1765 mjh Running 128 1:15:20 Fri May 7 13:38:20 1766 mjh Running 128 1:15:20 Fri May 7 13:38:20 1767 mjh Running 128 1:15:20 Fri May 7 13:38:20 1768 mjh Running 128 1:15:20 Fri May 7 13:38:20 1769 mjh Running 128 1:15:20 Fri May 7 13:38:20 5 Active Jobs 640 of 640 Processors Active (100.00%) 80 of 80 Nodes Active (100.00%) IDLE JOBS---------------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME 1770 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1771 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1772 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1773 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1774 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1775 mjh Idle 128 5:00:00 Fri May 7 13:38:20 6 Idle Jobs BLOCKED JOBS---------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME Total Jobs: 11 Active Jobs: 5 Idle Jobs: 6 Blocked Jobs: 0 </pre> Finally, you can delete jobs from the queue with the Torque command <tt>qdel</tt>. == Running code == Your job will start running on ''one'' of the nodes it has been allocated. It is your script's responsibility to start running on any other nodes you have been allocated. At run time, environment variables contain information about the job. A full list is [http://www.clusterresources.com/torquedocs21/2.1jobsubmission.shtml#submitvars here]. The crucial one is $PBS_NODEFILE, which provides a list of nodes on which the job should run. 
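The node file is a plain text file with one line per allocated processor, so (with hypothetical node names, and assuming the scheduler places the job on two distinct nodes) a request for <tt>nodes=2:ppn=4</tt> might produce something like:
<pre>
> cat $PBS_NODEFILE
node001
node001
node001
node001
node002
node002
node002
node002
</pre>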
''You must honour this node list.'' The [http://www.clusterresources.com/torquedocs21/commands/pbsdsh.shtml pbsdsh] command can be used in your script to run a single program on all the allocated nodes, as in the following example: <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ pbsdsh hostname echo ------------------------------------------------------ echo Job ends </pre> pbsdsh would be suitable for starting a number of parallel processes which do not need to communicate, e.g. multiple identical Monte Carlo simulations. For scripts to start [[MPI]] processes in such a way that inter-process communication (IPC) works see the description of [[MPI]]. 1f35afc5a24981313d6f0ce37a84f7f1c69d9a2c Networking 0 10 26 2010-05-06T18:53:40Z Mjh 2 Created page with 'The nodes are linked by Gigabit ethernet and Infiniband networks. The ethernet network is a fairly conventional one; each chassis (see [[architecture]]) has an internal ethernet…' wikitext text/x-wiki The nodes are linked by Gigabit ethernet and Infiniband networks. The ethernet network is a fairly conventional one; each chassis (see [[architecture]]) has an internal ethernet switch to which all nodes in that chassis are connected. These switches, together with the head node, are connected together via a main ethernet switch. Management traffic also uses this switch. There are in fact two infiniband networks: one for the main cluster, which is dual data-rate (nominal 2.5 Gbit/s) and one for the CAIR cluster, which is quad data-rate (5.0 Gbit/s). As for the ethernet, each chassis has an internal Infiniband switch. The two sub-clusters each have an external Infiniband switch which connects the chassis that make up the cluster and the head node is connected, separately, to both of those switches. Therefore there is no native infiniband connectivity between the two sub-clusters. IP over Infiniband can be used to connect between them, but this should not be relied on for large data volumes as it must be routed via the head node. The networking arrangements should be taken into consideration when running jobs where fast communication is important -- latency and data transfer rates are somewhat higher between nodes in the same chassis than between different chassis in the same cluster, and ethernet connections between the two sub-clusters are higher-latency and lower-bandwidth still, while native Infiniband connections between the two sub-clusters are essentially impossible. Best results will be obtained running jobs within a single chassis. None of the nodes are on the public Internet: they all have private IP addresses starting 192.168.... These addresses are not routable from anywhere else in the University. They can, however, all see the public Internet via IP masquerading through the head node through the ethernet network. 
(For this reason, it is not sensible to try to make high-volume accesses to the Internet from the nodes.) Each node has an IP address corresponding to its ethernet port (192.168.2.xxx where xxx is the node number) and one corresponding to the Infiniband port (192.168.3.xxx or 4.xxx, where the 3 and 4 refer to the main and CAIR clusters respectively). Within the cluster, you may use nodexxx.data (e.g. node001.data) and nodexxx.infi (node001.infi) to refer to these two networks. High-volume traffic should use the Infiniband network. f8d84ed07663bde5f5bcbdb5e983101974fab77f Clusterwiki:About 4 11 31 2010-05-06T19:24:23Z Mjh 2 Created page with 'This wiki documents the STRI cluster.' wikitext text/x-wiki This wiki documents the STRI cluster. 7b631f7f0c1d2c5642063355147d292172d16a3e 32 31 2010-05-06T19:25:02Z Mjh 2 wikitext text/x-wiki This wiki documents the STRI cluster. It uses [[Mediawiki]] and runs under Linux on the cluster head node. 9d32ed9b7e49666fd0bc9610a4efbaa2226173ee 33 32 2010-05-06T19:25:19Z Mjh 2 wikitext text/x-wiki This wiki documents the STRI cluster. It uses [[MediaWiki]] and runs under Linux on the cluster head node. 08be33541d11f18a681e80ec16541a2c39c7fbd7 MPI 0 12 39 2010-05-09T10:51:52Z Mjh 2 Created page with '== What is MPI? == MPI stands for ''Message Passing Interface''. It provides a standard for communication between processes either running on the same host or over a network. Pr…' wikitext text/x-wiki == What is MPI? == MPI stands for ''Message Passing Interface''. It provides a standard for communication between processes either running on the same host or over a network. Programs written using this standard can communicate with each other without needing to know details of the way in which they are connected. For a more detailed description see the [http://en.wikipedia.org/wiki/Message_Passing_Interface wikipedia page]. The MPI standard (with extensions in some cases) has been implemented by many different groups. MPI tutorials are widely available on the web. == MPI on the cluster == There are several implementations of MPI on the cluster. All of them provide compilation tools (mpicc, mpif77), libraries, and a means of running code. All use of MPI on the cluster must be integrated with the [[jobs|job control system]] and this page describes how to do that. === MPICH2 === MPICH2 is the default implementation since it is provided as a package within Fedora. If you take no action, your MPI code will be compiled with MPICH2. Check which implementation you're using like this: <pre> > which mpicc /usr/lib64/mpich2/bin/mpicc </pre> MPICH2 is '''not''' Infiniband-aware. The communication between processes will use the Infiniband network, but will not use it natively: thus the latency and bandwidth of IPC will be about an order of magnitude worse than it would be with Infiniband-aware implementations (see below). 
To run MPICH2 jobs, your job control system script should call <tt>/usr/local/bin/mpiexec</tt>: <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ /usr/local/bin/mpiexec /home/myusername/mympijob echo ------------------------------------------------------ echo Job ends </pre> <tt>mpiexec</tt> uses the Torque environment variables to start processes up on the appropriate nodes. == MVAPICH2 == MVAPICH2 is one of the three MPI implementations installed as part of the [http://www.openfabrics.org/ OFED] Infiniband software distribution. It uses native Infiniband system calls to communicate. This means that the latency of IPC can be as low as a few microseconds and the full bandwidth of the Infiniband interfaces is available. To use MVIPICH2 run the command mpi-selector-menu on the head node and choose the appropriate option. When you next log in, your path and libraries will be set up correctly: <pre> > which mpicc /usr/mpi/gcc/mvapich2-1.4.1/bin/mpicc </pre> MVAPICH2 integration with Torque is not as good as for MPICH2. A script <tt>/soft/bin/torque-mv</tt> 8961e4575964141945840229e3e2351b15c45ea5 40 39 2010-05-09T11:00:40Z Mjh 2 wikitext text/x-wiki == What is MPI? == MPI stands for ''Message Passing Interface''. It provides a standard for communication between processes either running on the same host or over a network. Programs written using this standard can communicate with each other without needing to know details of the way in which they are connected. For a more detailed description see the [http://en.wikipedia.org/wiki/Message_Passing_Interface wikipedia page]. The MPI standard (with extensions in some cases) has been implemented by many different groups. MPI tutorials are widely available on the web. == MPI on the cluster == There are several implementations of MPI on the cluster. All of them provide compilation tools (mpicc, mpif77), libraries, and a means of running code. All use of MPI on the cluster must be integrated with the [[jobs|job control system]] and this page describes how to do that. === MPICH2 === MPICH2 is the default implementation since it is provided as a package within Fedora. If you take no action, your MPI code will be compiled with MPICH2. Check which implementation you're using like this: <pre> > which mpicc /usr/lib64/mpich2/bin/mpicc </pre> MPICH2 is '''not''' Infiniband-aware. The communication between processes will use the Infiniband network, but will not use it natively: thus the latency and bandwidth of IPC will be about an order of magnitude worse than it would be with Infiniband-aware implementations (see below). 
To run MPICH2 jobs, your job control system script should call <tt>/usr/local/bin/mpiexec</tt>: <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ /usr/local/bin/mpiexec /home/myusername/mympijob echo ------------------------------------------------------ echo Job ends </pre> <tt>mpiexec</tt> uses the Torque environment variables to start processes up on the appropriate nodes. == MVAPICH2 == MVAPICH2 is one of the three MPI implementations installed as part of the [http://www.openfabrics.org/ OFED] Infiniband software distribution. It uses native Infiniband system calls to communicate. This means that the latency of IPC can be as low as a few microseconds and the full bandwidth of the Infiniband interfaces is available. To use MVIPICH2 run the command mpi-selector-menu on the head node and choose the appropriate option. When you next log in, your path and libraries will be set up correctly: <pre> > which mpicc /usr/mpi/gcc/mvapich2-1.4.1/bin/mpicc </pre> MVAPICH2 integration with Torque is not as good as for MPICH2 (at present <tt>mpiexec</tt> does not work, though it's supposed to). A script <tt>/soft/bin/torque-mv</tt> has been provided as a partial replacement for <tt>mpiexec</tt>; this uses the ssh-based <tt>mpirun_rsh</tt> tool to start jobs on all appropriate machines. You will need [[passwordless ssh]] set up for this to work. <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ /soft/bin/torque-mv /home/myusername/mympijob echo ------------------------------------------------------ echo Job ends </pre> === MVAPICH === This is an earlier Infiniband-aware implementation of MPI. Not recommended. === OpenMPI === This is the third implementation provided by the OFED packages; like the others, it can be selected using <tt>mpi-selector-menu</tt>. In principle it should be as good as MVIPICH2 from the point of view of Infiniband use. If you want to use it you will need to integrate it with Torque yourself -- please feel free to do so and to document it here. 4232a4110833461c79bad5e89960b3de46ef468c 41 40 2010-05-09T11:01:11Z Mjh 2 wikitext text/x-wiki == What is MPI? == MPI stands for ''Message Passing Interface''. 
It provides a standard for communication between processes either running on the same host or over a network. Programs written using this standard can communicate with each other without needing to know details of the way in which they are connected. For a more detailed description see the [http://en.wikipedia.org/wiki/Message_Passing_Interface wikipedia page]. The MPI standard (with extensions in some cases) has been implemented by many different groups. MPI tutorials are widely available on the web. == MPI on the cluster == There are several implementations of MPI on the cluster. All of them provide compilation tools (mpicc, mpif77), libraries, and a means of running code. All use of MPI on the cluster must be integrated with the [[jobs|job control system]] and this page describes how to do that. === MPICH2 === MPICH2 is the default implementation since it is provided as a package within Fedora. If you take no action, your MPI code will be compiled with MPICH2. Check which implementation you're using like this: <pre> > which mpicc /usr/lib64/mpich2/bin/mpicc </pre> MPICH2 is '''not''' Infiniband-aware. The communication between processes will use the Infiniband network, but will not use it natively: thus the latency and bandwidth of IPC will be about an order of magnitude worse than it would be with Infiniband-aware implementations (see below). To run MPICH2 jobs, your job control system script should call <tt>/usr/local/bin/mpiexec</tt>: <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ /usr/local/bin/mpiexec /home/myusername/mympijob echo ------------------------------------------------------ echo Job ends </pre> <tt>mpiexec</tt> uses the Torque environment variables to start processes up on the appropriate nodes. === MVAPICH2 === MVAPICH2 is one of the three MPI implementations installed as part of the [http://www.openfabrics.org/ OFED] Infiniband software distribution. It uses native Infiniband system calls to communicate. This means that the latency of IPC can be as low as a few microseconds and the full bandwidth of the Infiniband interfaces is available. To use MVIPICH2 run the command mpi-selector-menu on the head node and choose the appropriate option. When you next log in, your path and libraries will be set up correctly: <pre> > which mpicc /usr/mpi/gcc/mvapich2-1.4.1/bin/mpicc </pre> MVAPICH2 integration with Torque is not as good as for MPICH2 (at present <tt>mpiexec</tt> does not work, though it's supposed to). A script <tt>/soft/bin/torque-mv</tt> has been provided as a partial replacement for <tt>mpiexec</tt>; this uses the ssh-based <tt>mpirun_rsh</tt> tool to start jobs on all appropriate machines. You will need [[passwordless ssh]] set up for this to work. 
<pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ /soft/bin/torque-mv /home/myusername/mympijob echo ------------------------------------------------------ echo Job ends </pre> === MVAPICH === This is an earlier Infiniband-aware implementation of MPI. Not recommended. === OpenMPI === This is the third implementation provided by the OFED packages; like the others, it can be selected using <tt>mpi-selector-menu</tt>. In principle it should be as good as MVIPICH2 from the point of view of Infiniband use. If you want to use it you will need to integrate it with Torque yourself -- please feel free to do so and to document it here. 1fc6554883a752301c119627f1bd3ef678dd8a04 45 41 2010-05-09T11:31:23Z Mjh 2 wikitext text/x-wiki == What is MPI? == MPI stands for ''Message Passing Interface''. It provides a standard for communication between processes either running on the same host or over a network. Programs written using this standard can communicate with each other without needing to know details of the way in which they are connected. For a more detailed description see the [http://en.wikipedia.org/wiki/Message_Passing_Interface wikipedia page]. The MPI standard (with extensions in some cases) has been implemented by many different groups. MPI tutorials are widely available on the web. == MPI on the cluster == There are several implementations of MPI on the cluster. All of them provide compilation tools (mpicc, mpif77), libraries, and a means of running code. All use of MPI on the cluster must be integrated with the [[jobs|job control system]] and this page describes how to do that. === MPICH2 === MPICH2 is the default implementation since it is provided as a package within Fedora. If you take no action, your MPI code will be compiled with MPICH2. Check which implementation you're using like this: <pre> > which mpicc /usr/lib64/mpich2/bin/mpicc </pre> MPICH2 is '''not''' Infiniband-aware. The communication between processes will use the Infiniband network, but will not use it natively: thus the latency and bandwidth of IPC will be about an order of magnitude worse than it would be with Infiniband-aware implementations (see below). 
To run MPICH2 jobs, your job control system script should call <tt>/usr/local/bin/mpiexec</tt>: <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ /usr/local/bin/mpiexec /home/myusername/mympijob echo ------------------------------------------------------ echo Job ends </pre> <tt>mpiexec</tt> uses the Torque environment variables to start processes up on the appropriate nodes. Note that in this example the '''only''' key line is the one calling <tt>mpiexec</tt>. The others are Torque directives or are there for debugging/demonstration purposes. === MVAPICH2 === MVAPICH2 is one of the three MPI implementations installed as part of the [http://www.openfabrics.org/ OFED] Infiniband software distribution. It uses native Infiniband system calls to communicate. This means that the latency of IPC can be as low as a few microseconds and the full bandwidth of the Infiniband interfaces is available. To use MVIPICH2 run the command mpi-selector-menu on the head node and choose the appropriate option. When you next log in, your path and libraries will be set up correctly: <pre> > which mpicc /usr/mpi/gcc/mvapich2-1.4.1/bin/mpicc </pre> MVAPICH2 integration with Torque is not as good as for MPICH2 (at present <tt>mpiexec</tt> does not work, though it's supposed to). A script <tt>/soft/bin/torque-mv</tt> has been provided as a partial replacement for <tt>mpiexec</tt>; this uses the ssh-based <tt>mpirun_rsh</tt> tool to start jobs on all appropriate machines. You will need [[passwordless ssh]] set up for this to work. <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ /soft/bin/torque-mv /home/myusername/mympijob echo ------------------------------------------------------ echo Job ends </pre> === MVAPICH === This is an earlier Infiniband-aware implementation of MPI. Not recommended. === OpenMPI === This is the third implementation provided by the OFED packages; like the others, it can be selected using <tt>mpi-selector-menu</tt>. In principle it should be as good as MVIPICH2 from the point of view of Infiniband use. If you want to use it you will need to integrate it with Torque yourself -- please feel free to do so and to document it here. 
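Whichever implementation you choose, the overall workflow is the same: compile with the <tt>mpicc</tt> (or <tt>mpif77</tt>) that is on your path, typically on the head node, and then submit a job script that launches the resulting binary with the matching starter. A minimal sketch, in which <tt>mympijob.c</tt> and <tt>mympijob.sh</tt> are hypothetical file names:
<pre>
> which mpicc                                      ## confirm which implementation is selected
/usr/lib64/mpich2/bin/mpicc
> mpicc -o /home/myusername/mympijob mympijob.c    ## compile against that implementation
> qsub mympijob.sh                                 ## the script calls /usr/local/bin/mpiexec (MPICH2)
                                                   ## or /soft/bin/torque-mv (MVAPICH2), as shown above
</pre>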
f4169e7686c1a21e8ab9fc5b6e5eb25d8ba0be91 Passwordless ssh 0 13 43 2010-05-09T11:18:20Z Mjh 2 Created page with 'For some applications you will need to enable passwordless ssh between nodes. The simplest way of doing this is as follows: * run <tt>ssh-keygen</tt> and generate a key with no…' wikitext text/x-wiki For some applications you will need to enable passwordless ssh between nodes. The simplest way of doing this is as follows: * run <tt>ssh-keygen</tt> and generate a key with no passphrase (just press return when prompted). * cd into your <tt>~/.ssh</tt> directory. * <tt>cat id_rsa.pub >> authorized_keys</tt> * To get your <tt>known_hosts</tt> file filled up correctly, do <tt>pdsh -f 1 -w 'node[001-080]' hostname</tt>. The first time you do this, you should see a bunch of messages about files being added to <tt>known_hosts</tt>. If you then do it again, you should just see the hostnames of all the nodes appearing in order. * Passwordless ssh is now set up. 2279d37263a9aaaa565e0a9c116ed78f5d07173a 44 43 2010-05-09T11:28:29Z Mjh 2 wikitext text/x-wiki For some applications you will need to enable passwordless ssh between nodes. The simplest way of doing this is as follows: * run <tt>ssh-keygen</tt> and generate a key with no passphrase (just press return when prompted). * cd into your <tt>~/.ssh</tt> directory. * <tt>cat id_rsa.pub >> authorized_keys</tt> * Passwordless ssh is now set up: try e.g. <tt>ssh node001 hostname</tt> to test it. db879b0ce64cbfbd57cc94791948aeddf50113de Parallelization 0 14 46 2010-05-09T11:55:05Z Mjh 2 Created page with 'It is very important to realise that running a job on the cluster does not automatically give you access to all the CPU resources of the cluster (or even of the subset of the clu…' wikitext text/x-wiki It is very important to realise that running a job on the cluster does not automatically give you access to all the CPU resources of the cluster (or even of the subset of the cluster you request from the [[jobs|job control system]]). There is one, and only one, situation in which you can run normal, single-threaded code on the cluster and get an improvement in performance. That is the situation in which ''you can usefully run multiple copies of the same program simultaneously without any communication between the different copies''; in other words, you are only limited by how many CPUs you can get access to. Examples might include Monte Carlo simulation or some data analysis tasks. Problems of this kind are called ''embarrassingly parallel''. Provided that your code is thread-safe -- that is, it doesn't have implementation errors that prevent it being run several times simultaneously, such as use of temporary files with the same name --- you can use the cluster for this sort of problem without modifying your code. You may be able to use the job control system with commands such as <tt>pbsdsh</tt>, or you may need to request that a node be dedicated to your task. Once you need the different parts of your code to communicate, you in general ''must'' use code that is written specifically to do so. At present [[MPI]] is the method provided to do this. If you are planning to run an application written by a third party, it needs to use [[MPI]] if you expect to be able to run on more than one node. If you are running your own code, you will need to modify it, often very substantially, to allow parallel execution. 
'''This is your responsibility, not that of the cluster administrators.''' 7cc2fed98d405f0817c8269b1edebeed022d5080 Queues 0 15 49 2010-05-09T12:26:56Z Mjh 2 Created page with 'There are four possible job queues available on the system: * 'main' is the default queue: this submits to the 48 nodes of the main cluster. There are no other limitations on th…' wikitext text/x-wiki There are four possible job queues available on the system: * 'main' is the default queue: this submits to the 48 nodes of the main cluster. There are no other limitations on this queue. * 'cair_s' submits to the 32 CAIR nodes, with a maximum wall time of 6 hours. * 'cair_l' also submits to the CAIR nodes with no maximum wall time. This queue is restricted to CAIR users. * Finally 'all' submits to all 80 nodes, with a maximum wall time of 2 hours; note that MPI over Infiniband will not work if you request nodes spanning both the main and CAIR clusters. 1629bba541d9767351ce758bb610ccb8c85c4138 50 49 2010-05-09T12:27:13Z Mjh 2 wikitext text/x-wiki There are four possible job queues available on the system: * 'main' is the default queue: this submits to the 48 nodes of the main cluster. There are no other limitations on this queue. * 'cair_s' submits to the 32 CAIR nodes, with a maximum wall time of 6 hours. * 'cair_l' also submits to the CAIR nodes with no maximum wall time. This queue is restricted to CAIR users. * Finally 'all' submits to all 80 nodes, with a maximum wall time of 2 hours; note that MPI over Infiniband will not work if you request nodes spanning both the main and CAIR clusters. a8c0b319e35e9b61d71396298a24578ceeecca0a Jobs 0 9 51 42 2010-05-09T16:49:30Z Mjh 2 wikitext text/x-wiki The main mechanism for carrying out computing on the cluster is the '''batch queue system'''. This is based around [http://www.clusterresources.com/torquedocs21/ Torque] (formerly known as PBS) and [http://www.clusterresources.com/products/maui/docs/mauiadmin.shtml Maui], which are free software and widely used in the HPC world, particularly in academia. Rather than running processes directly on the compute nodes, you submit a script which, in general, contains both directives to the queuing system and the commands needed to run your jobs. The scheduler will then decide, based on the available resources, whether your job can be run immediately or must be deferred till later. If other people's jobs are running, you are likely to have to wait. It is important therefore to request only the minimum resources you need: a job that only needs 8 nodes may be able to run immediately while a job that requests all 48 nodes in the main cluster may need to wait for days. Once your job starts running the standard output and standard error can be saved or discarded, at your option. Most jobs will operate on files in one of the cluster-wide file systems (see [[storage]]). You can also arrange to have the system send you e-mail when your job starts and/or ends. == Basic commands == The command to submit a job to the batch queue system is <tt>qsub</tt>. <tt>qsub</tt> takes one obligatory argument, the name of the script that will be run to start the job: <pre> qsub myjob.sh </pre> This must be a script: specifying a binary file will not work. Optionally, qsub can take a number of arguments that specify what resources the job requires and how it is to be handled by the queuing system. Alternatively, these can be specified in the script, tagged with lines beginning <tt>#PBS</tt> before any executable lines. 
For example, <pre> cat << END > myjob1.sh #!/bin/sh echo "Hello world" END qsub -N hello -m abe myjob1.sh </pre> and <pre> cat <<END > myjob2.sh #!/bin/sh #PBS -N hello #PBS -m abe echo "Hello world" END qsub myjob2.sh </pre> are functionally equivalent. It is often more useful to have the options specified in the script. You can also have a mixture: if the same option is specified on the <tt>qsub</tt> command line and in a script directive, the command line takes precedence. A full list of qsub options is given by <tt>man qsub</tt> or is available [http://www.clusterresources.com/torquedocs21/commands/qsub.shtml online]. Some important ones are as follows: * -N: specify the name of the job as it will appear in the queue * -m: say when the system should e-mail you: <tt>-m abe</tt> means to e-mail on abort, begin and end * -k: specify whether the standard output or standard error streams of the job should be preserved (by default in your home directory). * -q: specify what [[queues|queue]] the job will be run on. The most important option is <tt>-l</tt>, which specifies the resources to be used by the job. On our system the useful resources to specify are * <tt>walltime</tt>: the expected execution time in h:m:s format. * <tt>nodes</tt>: a list of requirements on the nodes to be allocated, or possibly more than one joined by a plus sign. For example, <pre> qsub -l walltime=1:0:0 -l nodes=16:ppn=8 job.sh ## run for one hour on 16 nodes with 8 CPUs per node qsub -l walltime=0:1:0 -l nodes=2:chassis1:ppn=4 job2.sh ## run for one minute on 2 nodes from chassis1 with 4 CPUs per node qsub -l walltime=0:5:0 -l nodes=1:chassis1:ppn=1+1:chassis2:ppn=1 job3.sh ## run for 5 minutes on one node from chassis1 and one from chassis2 qsub -l walltime=0:0:10 -l nodes=node001.infi:ppn=8 job4.sh ## run for 10 seconds on all CPUs of node001 </pre> (In general you will not want to specify node or [[architecture|chassis]] names directly, as this will interfere with the efficiency of the scheduling. The exception might be if you want to guarantee that all inter-process communications take place within a chassis; see [[networking]]). The defaults for all the other options are sensible but you ''must'' think carefully about the resources you need. Requesting more time or resources than you need will lead to inefficiency in scheduling your jobs: requesting less than you need (particularly in terms of walltime) is likely to mean that your job does not run to completion. A job that exceeds the walltime estimate will be terminated. 
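For example, to direct a job at a particular [[queues|queue]] rather than the default (a sketch only; <tt>job.sh</tt> is a placeholder script name):
<pre>
qsub -q cair_s -l walltime=4:0:0 -l nodes=8:ppn=8 job.sh   ## submit to the CAIR short queue (6-hour limit)
qsub -q all -l walltime=1:0:0 -l nodes=16:ppn=8 job.sh     ## submit to the 'all' queue (2-hour limit, spans both sub-clusters)
</pre>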
Once a job is submitted, you can view its progress with <tt>qstat</tt>: <pre> >qstat Job id Name User Time Use S Queue ------------------- ---------------- --------------- -------- - ----- 1765.stri-cluster HCG62-doublebeta mjh 00:00:00 R all 1766.stri-cluster ...59-doublebeta mjh 00:00:00 R all 1767.stri-cluster ...07-doublebeta mjh 00:00:00 R all 1768.stri-cluster ...83-doublebeta mjh 00:00:00 R all 1769.stri-cluster ...73-doublebeta mjh 00:00:00 R all 1770.stri-cluster ...61-doublebeta mjh 0 Q all 1771.stri-cluster ...36-doublebeta mjh 0 Q all 1772.stri-cluster ...44-doublebeta mjh 0 Q all 1773.stri-cluster ...29-doublebeta mjh 0 Q all 1774.stri-cluster ...33-doublebeta mjh 0 Q all 1775.stri-cluster ...46-doublebeta mjh 0 Q all >qstat -a stri-cluster.herts.ac.uk: Req'd Req'd Elap Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time -------------------- -------- -------- ---------- ------ ----- --- ------ ----- - ----- 1765.stri-cluster.he mjh all HCG62-doub 11125 16 -- -- 05:00 R 01:56 1766.stri-cluster.he mjh all IC1459-dou 10658 16 -- -- 05:00 R 01:56 1767.stri-cluster.he mjh all NGC3607-do 7355 16 -- -- 05:00 R 01:56 1768.stri-cluster.he mjh all NGC383-dou 12069 16 -- -- 05:00 R 01:56 1769.stri-cluster.he mjh all NGC4073-do 12793 16 -- -- 05:00 R 01:56 1770.stri-cluster.he mjh all NGC4261-do -- 16 -- -- 05:00 Q -- 1771.stri-cluster.he mjh all NGC4636-do -- 16 -- -- 05:00 Q -- 1772.stri-cluster.he mjh all NGC5044-do -- 16 -- -- 05:00 Q -- 1773.stri-cluster.he mjh all NGC5129-do -- 16 -- -- 05:00 Q -- 1774.stri-cluster.he mjh all NGC533-dou -- 16 -- -- 05:00 Q -- 1775.stri-cluster.he mjh all NGC5846-do -- 16 -- -- 05:00 Q -- </pre> The different options to qstat give different views of the queue. Here we see that some jobs are running (R) and some are queued (Q). (Note that in the default <tt>qstat</tt> view the CPU time is not displayed correctly.) The MAUI tool <tt>showq</tt> can also be used to get a view of the whole queue: <pre> /usr/local/maui/bin/showq ACTIVE JOBS-------------------- JOBNAME USERNAME STATE PROC REMAINING STARTTIME 1765 mjh Running 128 1:15:20 Fri May 7 13:38:20 1766 mjh Running 128 1:15:20 Fri May 7 13:38:20 1767 mjh Running 128 1:15:20 Fri May 7 13:38:20 1768 mjh Running 128 1:15:20 Fri May 7 13:38:20 1769 mjh Running 128 1:15:20 Fri May 7 13:38:20 5 Active Jobs 640 of 640 Processors Active (100.00%) 80 of 80 Nodes Active (100.00%) IDLE JOBS---------------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME 1770 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1771 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1772 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1773 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1774 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1775 mjh Idle 128 5:00:00 Fri May 7 13:38:20 6 Idle Jobs BLOCKED JOBS---------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME Total Jobs: 11 Active Jobs: 5 Idle Jobs: 6 Blocked Jobs: 0 </pre> Finally, you can delete jobs from the queue with the Torque command <tt>qdel</tt>. == Running code == Your job will start running on ''one'' of the nodes it has been allocated. It is your script's responsibility to start running on any other nodes you have been allocated. At run time, environment variables contain information about the job. A full list is [http://www.clusterresources.com/torquedocs21/2.1jobsubmission.shtml#submitvars here]. The crucial one is $PBS_NODEFILE, which provides a list of nodes on which the job should run. 
''You must honour this node list.'' The [http://www.clusterresources.com/torquedocs21/commands/pbsdsh.shtml pbsdsh] command can be used in your script to run a single program on all the allocated nodes, as in the following example: <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ pbsdsh hostname echo ------------------------------------------------------ echo Job ends </pre> pbsdsh would be suitable for starting a number of parallel processes which do not need to communicate, e.g. multiple identical Monte Carlo simulations. For scripts to start [[MPI]] processes in such a way that inter-process communication (IPC) works see the description of [[MPI]]. c9f53a41c61d606b569a8f75c912a9d06c5cdf4b 63 51 2010-05-20T15:15:15Z Mjh 2 wikitext text/x-wiki The main mechanism for carrying out computing on the cluster is the '''batch queue system'''. This is based around [http://www.clusterresources.com/torquedocs21/ Torque] (formerly known as PBS) and [http://www.clusterresources.com/products/maui/docs/mauiadmin.shtml Maui], which are free software and widely used in the HPC world, particularly in academia. Rather than running processes directly on the compute nodes, you submit a script which, in general, contains both directives to the queuing system and the commands needed to run your jobs. The scheduler will then decide, based on the available resources, whether your job can be run immediately or must be deferred till later. If other people's jobs are running, you are likely to have to wait. It is important therefore to request only the minimum resources you need: a job that only needs 8 nodes may be able to run immediately while a job that requests all 48 nodes in the main cluster may need to wait for days. Once your job starts running the standard output and standard error can be saved or discarded, at your option. Most jobs will operate on files in one of the cluster-wide file systems (see [[storage]]). You can also arrange to have the system send you e-mail when your job starts and/or ends. == Basic commands == The command to submit a job to the batch queue system is <tt>qsub</tt>. <tt>qsub</tt> takes one obligatory argument, the name of the script that will be run to start the job: <pre> qsub myjob.sh </pre> This must be a script: specifying a binary file will not work. Optionally, qsub can take a number of arguments that specify what resources the job requires and how it is to be handled by the queuing system. Alternatively, these can be specified in the script, tagged with lines beginning <tt>#PBS</tt> before any executable lines. For example, <pre> cat << END > myjob1.sh #!/bin/sh echo "Hello world" END qsub -N hello -m abe myjob1.sh </pre> and <pre> cat <<END > myjob2.sh #!/bin/sh #PBS -N hello #PBS -m abe echo "Hello world" END qsub myjob2.sh </pre> are functionally equivalent. 
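On submission, <tt>qsub</tt> should print the identifier assigned to the new job, which is what <tt>qstat</tt> and <tt>qdel</tt> expect. A small sketch, reusing <tt>myjob2.sh</tt> from above (the identifier shown is made up):
<pre>
> qsub myjob2.sh
1776.stri-cluster.herts.ac.uk
> qstat 1776          ## check the state of just this job
> qdel 1776           ## remove it from the queue again if it was submitted by mistake
</pre>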
It is often more useful to have the options specified in the script. You can also have a mixture: if the same option is specified on the <tt>qsub</tt> command line and in a script directive, the command line takes precedence. A full list of qsub options is given by <tt>man qsub</tt> or is available [http://www.clusterresources.com/torquedocs21/commands/qsub.shtml online]. Some important ones are as follows: * -N: specify the name of the job as it will appear in the queue * -m: say when the system should e-mail you: <tt>-m abe</tt> means to e-mail on abort, begin and end * -k: specify whether the standard output or standard error streams of the job should be preserved (by default in your home directory). * -q: specify what [[queues|queue]] the job will be run on. The most important option is <tt>-l</tt>, which specifies the resources to be used by the job. On our system the useful resources to specify are * <tt>walltime</tt>: the expected execution time in h:m:s format. * <tt>nodes</tt>: a list of requirements on the nodes to be allocated, or possibly more than one joined by a plus sign. For example, <pre> qsub -l walltime=1:0:0 -l nodes=16:ppn=8 job.sh ## run for one hour on 16 nodes with 8 CPUs per node qsub -l walltime=0:1:0 -l nodes=2:chassis1:ppn=4 job2.sh ## run for one minute on 2 nodes from chassis1 with 4 CPUs per node qsub -l walltime=0:5:0 -l nodes=1:chassis1:ppn=1+1:chassis2:ppn=1 job3.sh ## run for 5 minutes on one node from chassis1 and one from chassis2 qsub -l walltime=0:0:10 -l nodes=node001.infi:ppn=8 job4.sh ## run for 10 seconds on all CPUs of node001 </pre> (In general you will not want to specify node or [[architecture|chassis]] names directly, as this will interfere with the efficiency of the scheduling. The exception might be if you want to guarantee that all inter-process communications take place within a chassis; see [[networking]]). Note also that in the present configuration requests for fewer processors per node than the 8 that are available may be consolidated onto fewer nodes. So the second request above would actually run on one physical node with 8 CPUs. This should never be a problem unless you are specifically testing inter-node communication. For efficiency, you are recommended always to ask for and use 8 processes per node. The defaults for all the other options are sensible but you ''must'' think carefully about the resources you need. Requesting more time or resources than you need will lead to inefficiency in scheduling your jobs: requesting less than you need (particularly in terms of walltime) is likely to mean that your job does not run to completion. A job that exceeds the walltime estimate will be terminated. 
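In line with the recommendation above, prefer requests that fill whole nodes. A sketch, with 32 processes in each case and <tt>job.sh</tt> as a placeholder:
<pre>
qsub -l walltime=1:0:0 -l nodes=4:ppn=8 job.sh   ## preferred: four whole nodes, 8 processes each
qsub -l walltime=1:0:0 -l nodes=8:ppn=4 job.sh   ## discouraged: may be consolidated onto fewer physical nodes
</pre>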
Once a job is submitted, you can view its progress with <tt>qstat</tt>: <pre> >qstat Job id Name User Time Use S Queue ------------------- ---------------- --------------- -------- - ----- 1765.stri-cluster HCG62-doublebeta mjh 00:00:00 R all 1766.stri-cluster ...59-doublebeta mjh 00:00:00 R all 1767.stri-cluster ...07-doublebeta mjh 00:00:00 R all 1768.stri-cluster ...83-doublebeta mjh 00:00:00 R all 1769.stri-cluster ...73-doublebeta mjh 00:00:00 R all 1770.stri-cluster ...61-doublebeta mjh 0 Q all 1771.stri-cluster ...36-doublebeta mjh 0 Q all 1772.stri-cluster ...44-doublebeta mjh 0 Q all 1773.stri-cluster ...29-doublebeta mjh 0 Q all 1774.stri-cluster ...33-doublebeta mjh 0 Q all 1775.stri-cluster ...46-doublebeta mjh 0 Q all >qstat -a stri-cluster.herts.ac.uk: Req'd Req'd Elap Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time -------------------- -------- -------- ---------- ------ ----- --- ------ ----- - ----- 1765.stri-cluster.he mjh all HCG62-doub 11125 16 -- -- 05:00 R 01:56 1766.stri-cluster.he mjh all IC1459-dou 10658 16 -- -- 05:00 R 01:56 1767.stri-cluster.he mjh all NGC3607-do 7355 16 -- -- 05:00 R 01:56 1768.stri-cluster.he mjh all NGC383-dou 12069 16 -- -- 05:00 R 01:56 1769.stri-cluster.he mjh all NGC4073-do 12793 16 -- -- 05:00 R 01:56 1770.stri-cluster.he mjh all NGC4261-do -- 16 -- -- 05:00 Q -- 1771.stri-cluster.he mjh all NGC4636-do -- 16 -- -- 05:00 Q -- 1772.stri-cluster.he mjh all NGC5044-do -- 16 -- -- 05:00 Q -- 1773.stri-cluster.he mjh all NGC5129-do -- 16 -- -- 05:00 Q -- 1774.stri-cluster.he mjh all NGC533-dou -- 16 -- -- 05:00 Q -- 1775.stri-cluster.he mjh all NGC5846-do -- 16 -- -- 05:00 Q -- </pre> The different options to qstat give different views of the queue. Here we see that some jobs are running (R) and some are queued (Q). (Note that in the default <tt>qstat</tt> view the CPU time is not displayed correctly.) The MAUI tool <tt>showq</tt> can also be used to get a view of the whole queue: <pre> /usr/local/maui/bin/showq ACTIVE JOBS-------------------- JOBNAME USERNAME STATE PROC REMAINING STARTTIME 1765 mjh Running 128 1:15:20 Fri May 7 13:38:20 1766 mjh Running 128 1:15:20 Fri May 7 13:38:20 1767 mjh Running 128 1:15:20 Fri May 7 13:38:20 1768 mjh Running 128 1:15:20 Fri May 7 13:38:20 1769 mjh Running 128 1:15:20 Fri May 7 13:38:20 5 Active Jobs 640 of 640 Processors Active (100.00%) 80 of 80 Nodes Active (100.00%) IDLE JOBS---------------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME 1770 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1771 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1772 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1773 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1774 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1775 mjh Idle 128 5:00:00 Fri May 7 13:38:20 6 Idle Jobs BLOCKED JOBS---------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME Total Jobs: 11 Active Jobs: 5 Idle Jobs: 6 Blocked Jobs: 0 </pre> Finally, you can delete jobs from the queue with the Torque command <tt>qdel</tt>. == Running code == Your job will start running on ''one'' of the nodes it has been allocated. It is your script's responsibility to start running on any other nodes you have been allocated. At run time, environment variables contain information about the job. A full list is [http://www.clusterresources.com/torquedocs21/2.1jobsubmission.shtml#submitvars here]. The crucial one is $PBS_NODEFILE, which provides a list of nodes on which the job should run. 
''You must honour this node list.'' The [http://www.clusterresources.com/torquedocs21/commands/pbsdsh.shtml pbsdsh] command can be used in your script to run a single program on all the allocated nodes, as in the following example: <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ pbsdsh hostname echo ------------------------------------------------------ echo Job ends </pre> pbsdsh would be suitable for starting a number of parallel processes which do not need to communicate, e.g. multiple identical Monte Carlo simulations. For scripts to start [[MPI]] processes in such a way that inter-process communication (IPC) works see the description of [[MPI]]. 1149f7c7b5ea985959aa401f95ea77275d1ecb66 64 63 2010-05-28T15:03:17Z Mjh 2 wikitext text/x-wiki The main mechanism for carrying out computing on the cluster is the '''batch queue system'''. This is based around [http://www.clusterresources.com/torquedocs21/ Torque] (formerly known as PBS) and [http://www.clusterresources.com/products/maui/docs/mauiadmin.shtml Maui], which are free software and widely used in the HPC world, particularly in academia. Rather than running processes directly on the compute nodes, you submit a script which, in general, contains both directives to the queuing system and the commands needed to run your jobs. The scheduler will then decide, based on the available resources, whether your job can be run immediately or must be deferred till later. If other people's jobs are running, you are likely to have to wait. It is important therefore to request only the minimum resources you need: a job that only needs 8 nodes may be able to run immediately while a job that requests all 48 nodes in the main cluster may need to wait for days. Once your job starts running the standard output and standard error can be saved or discarded, at your option. Most jobs will operate on files in one of the cluster-wide file systems (see [[storage]]). You can also arrange to have the system send you e-mail when your job starts and/or ends. == Basic commands == The command to submit a job to the batch queue system is <tt>qsub</tt>. <tt>qsub</tt> takes one obligatory argument, the name of the script that will be run to start the job: <pre> qsub myjob.sh </pre> This must be a script: specifying a binary file will not work. Optionally, qsub can take a number of arguments that specify what resources the job requires and how it is to be handled by the queuing system. Alternatively, these can be specified in the script, tagged with lines beginning <tt>#PBS</tt> before any executable lines. For example, <pre> cat << END > myjob1.sh #!/bin/sh echo "Hello world" END qsub -N hello -m abe myjob1.sh </pre> and <pre> cat <<END > myjob2.sh #!/bin/sh #PBS -N hello #PBS -m abe echo "Hello world" END qsub myjob2.sh </pre> are functionally equivalent. 
It is often more useful to have the options specified in the script. You can also have a mixture: if the same option is specified on the <tt>qsub</tt> command line and in a script directive, the command line takes precedence. A full list of qsub options is given by <tt>man qsub</tt> or is available [http://www.clusterresources.com/torquedocs21/commands/qsub.shtml online]. Some important ones are as follows: * -N: specify the name of the job as it will appear in the queue * -m: say when the system should e-mail you: <tt>-m abe</tt> means to e-mail on abort, begin and end * -k: specify whether the standard output or standard error streams of the job should be preserved (by default in your home directory). * -q: specify what [[queues|queue]] the job will be run on. The most important option is <tt>-l</tt>, which specifies the resources to be used by the job. On our system the useful resources to specify are * <tt>walltime</tt>: the expected execution time in h:m:s format. * <tt>nodes</tt>: a list of requirements on the nodes to be allocated, or possibly more than one joined by a plus sign. For example, <pre> qsub -l walltime=1:0:0 -l nodes=16:ppn=8 job.sh ## run for one hour on 16 nodes with 8 CPUs per node qsub -l walltime=0:1:0 -l nodes=2:chassis1:ppn=4 job2.sh ## run for one minute on 2 nodes from chassis1 with 4 CPUs per node qsub -l walltime=0:5:0 -l nodes=1:chassis1:ppn=1+1:chassis2:ppn=1 job3.sh ## run for 5 minutes on one node from chassis1 and one from chassis2 qsub -l walltime=0:0:10 -l nodes=node001.infi:ppn=8 job4.sh ## run for 10 seconds on all CPUs of node001 </pre> (In general you will not want to specify node or [[architecture|chassis]] names directly, as this will interfere with the efficiency of the scheduling. The exception might be if you want to guarantee that all inter-process communications take place within a chassis; see [[networking]]). Note also that in the present configuration requests for fewer processors per node than the 8 that are available may be consolidated onto fewer nodes. So the second request above would actually run on one physical node with 8 CPUs. This should never be a problem unless you are specifically testing inter-node communication. For efficiency, you are recommended always to ask for and use 8 processes per node. The defaults for all the other options are sensible but you ''must'' think carefully about the resources you need. Requesting more time or resources than you need will lead to inefficiency in scheduling your jobs: requesting less than you need (particularly in terms of walltime) is likely to mean that your job does not run to completion. A job that exceeds the walltime estimate will be terminated. 
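The resource requests shown above can equally be embedded in the script itself as <tt>#PBS</tt> directives, which keeps the submission command short. A minimal sketch follows; the job name, file name and script contents are invented for illustration:
<pre>
#!/bin/sh
#PBS -N myrun
#PBS -m abe
#PBS -l walltime=1:0:0
#PBS -l nodes=16:ppn=8
echo "Hello world"
./my-code
</pre>
With these directives in the script it can be submitted with a bare <tt>qsub myrun.sh</tt>.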
Once a job is submitted, you can view its progress with <tt>qstat</tt>: <pre> >qstat Job id Name User Time Use S Queue ------------------- ---------------- --------------- -------- - ----- 1765.stri-cluster HCG62-doublebeta mjh 00:00:00 R all 1766.stri-cluster ...59-doublebeta mjh 00:00:00 R all 1767.stri-cluster ...07-doublebeta mjh 00:00:00 R all 1768.stri-cluster ...83-doublebeta mjh 00:00:00 R all 1769.stri-cluster ...73-doublebeta mjh 00:00:00 R all 1770.stri-cluster ...61-doublebeta mjh 0 Q all 1771.stri-cluster ...36-doublebeta mjh 0 Q all 1772.stri-cluster ...44-doublebeta mjh 0 Q all 1773.stri-cluster ...29-doublebeta mjh 0 Q all 1774.stri-cluster ...33-doublebeta mjh 0 Q all 1775.stri-cluster ...46-doublebeta mjh 0 Q all >qstat -a stri-cluster.herts.ac.uk: Req'd Req'd Elap Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time -------------------- -------- -------- ---------- ------ ----- --- ------ ----- - ----- 1765.stri-cluster.he mjh all HCG62-doub 11125 16 -- -- 05:00 R 01:56 1766.stri-cluster.he mjh all IC1459-dou 10658 16 -- -- 05:00 R 01:56 1767.stri-cluster.he mjh all NGC3607-do 7355 16 -- -- 05:00 R 01:56 1768.stri-cluster.he mjh all NGC383-dou 12069 16 -- -- 05:00 R 01:56 1769.stri-cluster.he mjh all NGC4073-do 12793 16 -- -- 05:00 R 01:56 1770.stri-cluster.he mjh all NGC4261-do -- 16 -- -- 05:00 Q -- 1771.stri-cluster.he mjh all NGC4636-do -- 16 -- -- 05:00 Q -- 1772.stri-cluster.he mjh all NGC5044-do -- 16 -- -- 05:00 Q -- 1773.stri-cluster.he mjh all NGC5129-do -- 16 -- -- 05:00 Q -- 1774.stri-cluster.he mjh all NGC533-dou -- 16 -- -- 05:00 Q -- 1775.stri-cluster.he mjh all NGC5846-do -- 16 -- -- 05:00 Q -- </pre> The different options to qstat give different views of the queue. Here we see that some jobs are running (R) and some are queued (Q). (Note that in the default <tt>qstat</tt> view the CPU time is not displayed correctly.) The MAUI tool <tt>showq</tt> can also be used to get a view of the whole queue: <pre> /usr/local/maui/bin/showq ACTIVE JOBS-------------------- JOBNAME USERNAME STATE PROC REMAINING STARTTIME 1765 mjh Running 128 1:15:20 Fri May 7 13:38:20 1766 mjh Running 128 1:15:20 Fri May 7 13:38:20 1767 mjh Running 128 1:15:20 Fri May 7 13:38:20 1768 mjh Running 128 1:15:20 Fri May 7 13:38:20 1769 mjh Running 128 1:15:20 Fri May 7 13:38:20 5 Active Jobs 640 of 640 Processors Active (100.00%) 80 of 80 Nodes Active (100.00%) IDLE JOBS---------------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME 1770 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1771 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1772 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1773 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1774 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1775 mjh Idle 128 5:00:00 Fri May 7 13:38:20 6 Idle Jobs BLOCKED JOBS---------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME Total Jobs: 11 Active Jobs: 5 Idle Jobs: 6 Blocked Jobs: 0 </pre> Finally, you can delete jobs from the queue with the Torque command <tt>qdel</tt>. == Running code == Your job will start running on ''one'' of the nodes it has been allocated. It is your script's responsibility to start running on any other nodes you have been allocated. At run time, environment variables contain information about the job. A full list is [http://www.clusterresources.com/torquedocs21/2.1jobsubmission.shtml#submitvars here]. The crucial one is $PBS_NODEFILE, which provides a list of nodes on which the job should run. 
''You must honour this node list.'' The [http://www.clusterresources.com/torquedocs21/commands/pbsdsh.shtml pbsdsh] command can be used in your script to run a single program on all the allocated nodes, as in the following example: <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ pbsdsh hostname echo ------------------------------------------------------ echo Job ends </pre> pbsdsh would be suitable for starting a number of parallel processes which do not need to communicate, e.g. multiple identical Monte Carlo simulations. For scripts to start [[MPI]] processes in such a way that inter-process communication (IPC) works see the description of [[MPI]]. == Environment == In all cases your script must set the environment that you want your program to run in. Environment variables, aliases and working directory are not inherited from the shell that runs qsub. So you will often want to change to an appropriate working directory and to source some startup files in your qsub script before you actually execute any code. 0e9b955029551149114f96d135f26e9b2e782e9a 65 64 2010-05-28T15:07:24Z Mjh 2 wikitext text/x-wiki The main mechanism for carrying out computing on the cluster is the '''batch queue system'''. This is based around [http://www.clusterresources.com/torquedocs21/ Torque] (formerly known as PBS) and [http://www.clusterresources.com/products/maui/docs/mauiadmin.shtml Maui], which are free software and widely used in the HPC world, particularly in academia. Rather than running processes directly on the compute nodes, you submit a script which, in general, contains both directives to the queuing system and the commands needed to run your jobs. The scheduler will then decide, based on the available resources, whether your job can be run immediately or must be deferred till later. If other people's jobs are running, you are likely to have to wait. It is important therefore to request only the minimum resources you need: a job that only needs 8 nodes may be able to run immediately while a job that requests all 48 nodes in the main cluster may need to wait for days. Once your job starts running the standard output and standard error can be saved or discarded, at your option. Most jobs will operate on files in one of the cluster-wide file systems (see [[storage]]). You can also arrange to have the system send you e-mail when your job starts and/or ends. == Basic commands == The command to submit a job to the batch queue system is <tt>qsub</tt>. <tt>qsub</tt> takes one obligatory argument, the name of the script that will be run to start the job: <pre> qsub myjob.sh </pre> This must be a script: specifying a binary file will not work. Optionally, qsub can take a number of arguments that specify what resources the job requires and how it is to be handled by the queuing system. 
Alternatively, these can be specified in the script, tagged with lines beginning <tt>#PBS</tt> before any executable lines. For example, <pre> cat << END > myjob1.sh #!/bin/sh echo "Hello world" END qsub -N hello -m abe myjob1.sh </pre> and <pre> cat <<END > myjob2.sh #!/bin/sh #PBS -N hello #PBS -m abe echo "Hello world" END qsub myjob2.sh </pre> are functionally equivalent. It is often more useful to have the options specified in the script. You can also have a mixture: if the same option is specified on the <tt>qsub</tt> command line and in a script directive, the command line takes precedence. A full list of qsub options is given by <tt>man qsub</tt> or is available [http://www.clusterresources.com/torquedocs21/commands/qsub.shtml online]. Some important ones are as follows: * -N: specify the name of the job as it will appear in the queue * -m: say when the system should e-mail you: <tt>-m abe</tt> means to e-mail on abort, begin and end * -k: specify whether the standard output or standard error streams of the job should be preserved (by default in your home directory). * -j: specify whether the output and error streams should be kept separate or merged. * -q: specify what [[queues|queue]] the job will be run on. The most important option is <tt>-l</tt>, which specifies the resources to be used by the job. On our system the useful resources to specify are * <tt>walltime</tt>: the expected execution time in h:m:s format. * <tt>nodes</tt>: a list of requirements on the nodes to be allocated, or possibly more than one joined by a plus sign. For example, <pre> qsub -l walltime=1:0:0 -l nodes=16:ppn=8 job.sh ## run for one hour on 16 nodes with 8 CPUs per node qsub -l walltime=0:1:0 -l nodes=2:chassis1:ppn=4 job2.sh ## run for one minute on 2 nodes from chassis1 with 4 CPUs per node qsub -l walltime=0:5:0 -l nodes=1:chassis1:ppn=1+1:chassis2:ppn=1 job3.sh ## run for 5 minutes on one node from chassis1 and one from chassis2 qsub -l walltime=0:0:10 -l nodes=node001.infi:ppn=8 job4.sh ## run for 10 seconds on all CPUs of node001 </pre> (In general you will not want to specify node or [[architecture|chassis]] names directly, as this will interfere with the efficiency of the scheduling. The exception might be if you want to guarantee that all inter-process communications take place within a chassis; see [[networking]]). Note also that in the present configuration requests for fewer processors per node than the 8 that are available may be consolidated onto fewer nodes. So the second request above would actually run on one physical node with 8 CPUs. This should never be a problem unless you are specifically testing inter-node communication. For efficiency, you are recommended always to ask for and use 8 processes per node. The defaults for all the other options are sensible but you ''must'' think carefully about the resources you need. Requesting more time or resources than you need will lead to inefficiency in scheduling your jobs: requesting less than you need (particularly in terms of walltime) is likely to mean that your job does not run to completion. A job that exceeds the walltime estimate will be terminated. 
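As an illustration of the <tt>-j</tt> and <tt>-k</tt> options listed above, the submission below asks for the error stream to be merged into the output stream and for the result to be kept in your home directory. The argument letters given here are the usual Torque ones; check <tt>man qsub</tt> on this system if in doubt.
<pre>
qsub -N hello -j oe -k oe myjob2.sh   ## merge stderr into stdout and keep the output file in your home directory
</pre>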
Once a job is submitted, you can view its progress with <tt>qstat</tt>: <pre> >qstat Job id Name User Time Use S Queue ------------------- ---------------- --------------- -------- - ----- 1765.stri-cluster HCG62-doublebeta mjh 00:00:00 R all 1766.stri-cluster ...59-doublebeta mjh 00:00:00 R all 1767.stri-cluster ...07-doublebeta mjh 00:00:00 R all 1768.stri-cluster ...83-doublebeta mjh 00:00:00 R all 1769.stri-cluster ...73-doublebeta mjh 00:00:00 R all 1770.stri-cluster ...61-doublebeta mjh 0 Q all 1771.stri-cluster ...36-doublebeta mjh 0 Q all 1772.stri-cluster ...44-doublebeta mjh 0 Q all 1773.stri-cluster ...29-doublebeta mjh 0 Q all 1774.stri-cluster ...33-doublebeta mjh 0 Q all 1775.stri-cluster ...46-doublebeta mjh 0 Q all >qstat -a stri-cluster.herts.ac.uk: Req'd Req'd Elap Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time -------------------- -------- -------- ---------- ------ ----- --- ------ ----- - ----- 1765.stri-cluster.he mjh all HCG62-doub 11125 16 -- -- 05:00 R 01:56 1766.stri-cluster.he mjh all IC1459-dou 10658 16 -- -- 05:00 R 01:56 1767.stri-cluster.he mjh all NGC3607-do 7355 16 -- -- 05:00 R 01:56 1768.stri-cluster.he mjh all NGC383-dou 12069 16 -- -- 05:00 R 01:56 1769.stri-cluster.he mjh all NGC4073-do 12793 16 -- -- 05:00 R 01:56 1770.stri-cluster.he mjh all NGC4261-do -- 16 -- -- 05:00 Q -- 1771.stri-cluster.he mjh all NGC4636-do -- 16 -- -- 05:00 Q -- 1772.stri-cluster.he mjh all NGC5044-do -- 16 -- -- 05:00 Q -- 1773.stri-cluster.he mjh all NGC5129-do -- 16 -- -- 05:00 Q -- 1774.stri-cluster.he mjh all NGC533-dou -- 16 -- -- 05:00 Q -- 1775.stri-cluster.he mjh all NGC5846-do -- 16 -- -- 05:00 Q -- </pre> The different options to qstat give different views of the queue. Here we see that some jobs are running (R) and some are queued (Q). (Note that in the default <tt>qstat</tt> view the CPU time is not displayed correctly.) The MAUI tool <tt>showq</tt> can also be used to get a view of the whole queue: <pre> /usr/local/maui/bin/showq ACTIVE JOBS-------------------- JOBNAME USERNAME STATE PROC REMAINING STARTTIME 1765 mjh Running 128 1:15:20 Fri May 7 13:38:20 1766 mjh Running 128 1:15:20 Fri May 7 13:38:20 1767 mjh Running 128 1:15:20 Fri May 7 13:38:20 1768 mjh Running 128 1:15:20 Fri May 7 13:38:20 1769 mjh Running 128 1:15:20 Fri May 7 13:38:20 5 Active Jobs 640 of 640 Processors Active (100.00%) 80 of 80 Nodes Active (100.00%) IDLE JOBS---------------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME 1770 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1771 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1772 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1773 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1774 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1775 mjh Idle 128 5:00:00 Fri May 7 13:38:20 6 Idle Jobs BLOCKED JOBS---------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME Total Jobs: 11 Active Jobs: 5 Idle Jobs: 6 Blocked Jobs: 0 </pre> Finally, you can delete jobs from the queue with the Torque command <tt>qdel</tt>. == Running code == Your job will start running on ''one'' of the nodes it has been allocated. It is your script's responsibility to start running on any other nodes you have been allocated. At run time, environment variables contain information about the job. A full list is [http://www.clusterresources.com/torquedocs21/2.1jobsubmission.shtml#submitvars here]. The crucial one is $PBS_NODEFILE, which provides a list of nodes on which the job should run. 
''You must honour this node list.'' The [http://www.clusterresources.com/torquedocs21/commands/pbsdsh.shtml pbsdsh] command can be used in your script to run a single program on all the allocated nodes, as in the following example: <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ pbsdsh hostname echo ------------------------------------------------------ echo Job ends </pre> pbsdsh would be suitable for starting a number of parallel processes which do not need to communicate, e.g. multiple identical Monte Carlo simulations. For scripts to start [[MPI]] processes in such a way that inter-process communication (IPC) works see the description of [[MPI]]. == Environment == In all cases your script must set the environment that you want your program to run in. Environment variables, aliases and working directory are not inherited from the shell that runs qsub. So you will often want to change to an appropriate working directory and to source some startup files in your qsub script before you actually execute any code. <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ cd /home/fred/my_working_directory export PATH=/home/fred/my_binaries:${PATH} /usr/local/bin/mpiexec my-mpi-code arg1 arg2 echo ------------------------------------------------------ echo Job ends </pre> 906bd3c417e87d3d6f9f518164e5d4bad9afa7eb 68 65 2010-05-28T15:14:43Z Mjh 2 wikitext text/x-wiki The main mechanism for carrying out computing on the cluster is the '''batch queue system'''. This is based around [http://www.clusterresources.com/torquedocs21/ Torque] (formerly known as PBS) and [http://www.clusterresources.com/products/maui/docs/mauiadmin.shtml Maui], which are free software and widely used in the HPC world, particularly in academia. Rather than running processes directly on the compute nodes, you submit a script which, in general, contains both directives to the queuing system and the commands needed to run your jobs. The scheduler will then decide, based on the available resources, whether your job can be run immediately or must be deferred till later. If other people's jobs are running, you are likely to have to wait. 
It is important therefore to request only the minimum resources you need: a job that only needs 8 nodes may be able to run immediately while a job that requests all 48 nodes in the main cluster may need to wait for days. Once your job starts running the standard output and standard error can be saved or discarded, at your option. Most jobs will operate on files in one of the cluster-wide file systems (see [[storage]]). You can also arrange to have the system send you e-mail when your job starts and/or ends. You should enable [[passwordless ssh]] between nodes before starting to use the job submission system. Various parts of it and code that depends on it rely on being able to use ssh or scp. == Basic commands == The command to submit a job to the batch queue system is <tt>qsub</tt>. <tt>qsub</tt> takes one obligatory argument, the name of the script that will be run to start the job: <pre> qsub myjob.sh </pre> This must be a script: specifying a binary file will not work. Optionally, qsub can take a number of arguments that specify what resources the job requires and how it is to be handled by the queuing system. Alternatively, these can be specified in the script, tagged with lines beginning <tt>#PBS</tt> before any executable lines. For example, <pre> cat << END > myjob1.sh #!/bin/sh echo "Hello world" END qsub -N hello -m abe myjob1.sh </pre> and <pre> cat <<END > myjob2.sh #!/bin/sh #PBS -N hello #PBS -m abe echo "Hello world" END qsub myjob2.sh </pre> are functionally equivalent. It is often more useful to have the options specified in the script. You can also have a mixture: if the same option is specified on the <tt>qsub</tt> command line and in a script directive, the command line takes precedence. A full list of qsub options is given by <tt>man qsub</tt> or is available [http://www.clusterresources.com/torquedocs21/commands/qsub.shtml online]. Some important ones are as follows: * -N: specify the name of the job as it will appear in the queue * -m: say when the system should e-mail you: <tt>-m abe</tt> means to e-mail on abort, begin and end * -k: specify whether the standard output or standard error streams of the job should be preserved (by default in your home directory). * -j: specify whether the output and error streams should be kept separate or merged. * -q: specify what [[queues|queue]] the job will be run on. The most important option is <tt>-l</tt>, which specifies the resources to be used by the job. On our system the useful resources to specify are * <tt>walltime</tt>: the expected execution time in h:m:s format. * <tt>nodes</tt>: a list of requirements on the nodes to be allocated, or possibly more than one joined by a plus sign. For example, <pre> qsub -l walltime=1:0:0 -l nodes=16:ppn=8 job.sh ## run for one hour on 16 nodes with 8 CPUs per node qsub -l walltime=0:1:0 -l nodes=2:chassis1:ppn=4 job2.sh ## run for one minute on 2 nodes from chassis1 with 4 CPUs per node qsub -l walltime=0:5:0 -l nodes=1:chassis1:ppn=1+1:chassis2:ppn=1 job3.sh ## run for 5 minutes on one node from chassis1 and one from chassis2 qsub -l walltime=0:0:10 -l nodes=node001.infi:ppn=8 job4.sh ## run for 10 seconds on all CPUs of node001 </pre> (In general you will not want to specify node or [[architecture|chassis]] names directly, as this will interfere with the efficiency of the scheduling. The exception might be if you want to guarantee that all inter-process communications take place within a chassis; see [[networking]]). 
Note also that in the present configuration requests for fewer processors per node than the 8 that are available may be consolidated onto fewer nodes. So the second request above would actually run on one physical node with 8 CPUs. This should never be a problem unless you are specifically testing inter-node communication. For efficiency, you are recommended always to ask for and use 8 processes per node. The defaults for all the other options are sensible but you ''must'' think carefully about the resources you need. Requesting more time or resources than you need will lead to inefficiency in scheduling your jobs: requesting less than you need (particularly in terms of walltime) is likely to mean that your job does not run to completion. A job that exceeds the walltime estimate will be terminated. Once a job is submitted, you can view its progress with <tt>qstat</tt>: <pre> >qstat Job id Name User Time Use S Queue ------------------- ---------------- --------------- -------- - ----- 1765.stri-cluster HCG62-doublebeta mjh 00:00:00 R all 1766.stri-cluster ...59-doublebeta mjh 00:00:00 R all 1767.stri-cluster ...07-doublebeta mjh 00:00:00 R all 1768.stri-cluster ...83-doublebeta mjh 00:00:00 R all 1769.stri-cluster ...73-doublebeta mjh 00:00:00 R all 1770.stri-cluster ...61-doublebeta mjh 0 Q all 1771.stri-cluster ...36-doublebeta mjh 0 Q all 1772.stri-cluster ...44-doublebeta mjh 0 Q all 1773.stri-cluster ...29-doublebeta mjh 0 Q all 1774.stri-cluster ...33-doublebeta mjh 0 Q all 1775.stri-cluster ...46-doublebeta mjh 0 Q all >qstat -a stri-cluster.herts.ac.uk: Req'd Req'd Elap Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time -------------------- -------- -------- ---------- ------ ----- --- ------ ----- - ----- 1765.stri-cluster.he mjh all HCG62-doub 11125 16 -- -- 05:00 R 01:56 1766.stri-cluster.he mjh all IC1459-dou 10658 16 -- -- 05:00 R 01:56 1767.stri-cluster.he mjh all NGC3607-do 7355 16 -- -- 05:00 R 01:56 1768.stri-cluster.he mjh all NGC383-dou 12069 16 -- -- 05:00 R 01:56 1769.stri-cluster.he mjh all NGC4073-do 12793 16 -- -- 05:00 R 01:56 1770.stri-cluster.he mjh all NGC4261-do -- 16 -- -- 05:00 Q -- 1771.stri-cluster.he mjh all NGC4636-do -- 16 -- -- 05:00 Q -- 1772.stri-cluster.he mjh all NGC5044-do -- 16 -- -- 05:00 Q -- 1773.stri-cluster.he mjh all NGC5129-do -- 16 -- -- 05:00 Q -- 1774.stri-cluster.he mjh all NGC533-dou -- 16 -- -- 05:00 Q -- 1775.stri-cluster.he mjh all NGC5846-do -- 16 -- -- 05:00 Q -- </pre> The different options to qstat give different views of the queue. Here we see that some jobs are running (R) and some are queued (Q). (Note that in the default <tt>qstat</tt> view the CPU time is not displayed correctly.) 
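On a busy system the full <tt>qstat</tt> listing can get long. The standard Torque option below restricts it to a single user's jobs; the username is a placeholder, and <tt>man qstat</tt> lists further filters.
<pre>
qstat -u mjh   ## show only jobs belonging to user mjh
</pre>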
The MAUI tool <tt>showq</tt> can also be used to get a view of the whole queue: <pre> /usr/local/maui/bin/showq ACTIVE JOBS-------------------- JOBNAME USERNAME STATE PROC REMAINING STARTTIME 1765 mjh Running 128 1:15:20 Fri May 7 13:38:20 1766 mjh Running 128 1:15:20 Fri May 7 13:38:20 1767 mjh Running 128 1:15:20 Fri May 7 13:38:20 1768 mjh Running 128 1:15:20 Fri May 7 13:38:20 1769 mjh Running 128 1:15:20 Fri May 7 13:38:20 5 Active Jobs 640 of 640 Processors Active (100.00%) 80 of 80 Nodes Active (100.00%) IDLE JOBS---------------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME 1770 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1771 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1772 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1773 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1774 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1775 mjh Idle 128 5:00:00 Fri May 7 13:38:20 6 Idle Jobs BLOCKED JOBS---------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME Total Jobs: 11 Active Jobs: 5 Idle Jobs: 6 Blocked Jobs: 0 </pre> Finally, you can delete jobs from the queue with the Torque command <tt>qdel</tt>. == Running code == Your job will start running on ''one'' of the nodes it has been allocated. It is your script's responsibility to start running on any other nodes you have been allocated. At run time, environment variables contain information about the job. A full list is [http://www.clusterresources.com/torquedocs21/2.1jobsubmission.shtml#submitvars here]. The crucial one is $PBS_NODEFILE, which provides a list of nodes on which the job should run. ''You must honour this node list.'' The [http://www.clusterresources.com/torquedocs21/commands/pbsdsh.shtml pbsdsh] command can be used in your script to run a single program on all the allocated nodes, as in the following example: <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ pbsdsh hostname echo ------------------------------------------------------ echo Job ends </pre> pbsdsh would be suitable for starting a number of parallel processes which do not need to communicate, e.g. multiple identical Monte Carlo simulations. For scripts to start [[MPI]] processes in such a way that inter-process communication (IPC) works see the description of [[MPI]]. == Environment == In all cases your script must set the environment that you want your program to run in. Environment variables, aliases and working directory are not inherited from the shell that runs qsub. So you will often want to change to an appropriate working directory and to source some startup files in your qsub script before you actually execute any code. 
<pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ cd /home/fred/my_working_directory export PATH=/home/fred/my_binaries:${PATH} /usr/local/bin/mpiexec my-mpi-code arg1 arg2 echo ------------------------------------------------------ echo Job ends </pre> 48c1d3bd6e29f499b979bc0ef41a591acf4df5bc 83 68 2010-06-11T10:58:07Z Mjh 2 add multiple job info wikitext text/x-wiki The main mechanism for carrying out computing on the cluster is the '''batch queue system'''. This is based around [http://www.clusterresources.com/torquedocs21/ Torque] (formerly known as PBS) and [http://www.clusterresources.com/products/maui/docs/mauiadmin.shtml Maui], which are free software and widely used in the HPC world, particularly in academia. Rather than running processes directly on the compute nodes, you submit a script which, in general, contains both directives to the queuing system and the commands needed to run your jobs. The scheduler will then decide, based on the available resources, whether your job can be run immediately or must be deferred till later. If other people's jobs are running, you are likely to have to wait. It is important therefore to request only the minimum resources you need: a job that only needs 8 nodes may be able to run immediately while a job that requests all 48 nodes in the main cluster may need to wait for days. Once your job starts running the standard output and standard error can be saved or discarded, at your option. Most jobs will operate on files in one of the cluster-wide file systems (see [[storage]]). You can also arrange to have the system send you e-mail when your job starts and/or ends. You should enable [[passwordless ssh]] between nodes before starting to use the job submission system. Various parts of it and code that depends on it rely on being able to use ssh or scp. == Basic commands == The command to submit a job to the batch queue system is <tt>qsub</tt>. <tt>qsub</tt> takes one obligatory argument, the name of the script that will be run to start the job: <pre> qsub myjob.sh </pre> This must be a script: specifying a binary file will not work. Optionally, qsub can take a number of arguments that specify what resources the job requires and how it is to be handled by the queuing system. Alternatively, these can be specified in the script, tagged with lines beginning <tt>#PBS</tt> before any executable lines. For example, <pre> cat << END > myjob1.sh #!/bin/sh echo "Hello world" END qsub -N hello -m abe myjob1.sh </pre> and <pre> cat <<END > myjob2.sh #!/bin/sh #PBS -N hello #PBS -m abe echo "Hello world" END qsub myjob2.sh </pre> are functionally equivalent. It is often more useful to have the options specified in the script. You can also have a mixture: if the same option is specified on the <tt>qsub</tt> command line and in a script directive, the command line takes precedence. 
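To make the precedence rule concrete: <tt>myjob2.sh</tt> above already contains a <tt>#PBS -N hello</tt> directive, so a different name supplied on the command line overrides it (the replacement name here is invented):
<pre>
qsub -N hello2 myjob2.sh   ## the job appears in the queue as "hello2", not "hello"
</pre>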
A full list of qsub options is given by <tt>man qsub</tt> or is available [http://www.clusterresources.com/torquedocs21/commands/qsub.shtml online]. Some important ones are as follows: * -N: specify the name of the job as it will appear in the queue * -m: say when the system should e-mail you: <tt>-m abe</tt> means to e-mail on abort, begin and end * -k: specify whether the standard output or standard error streams of the job should be preserved (by default in your home directory). * -j: specify whether the output and error streams should be kept separate or merged. * -q: specify what [[queues|queue]] the job will be run on. * -t: start multiple jobs simultaneously (see below) The most important option is <tt>-l</tt>, which specifies the resources to be used by the job. On our system the useful resources to specify are * <tt>walltime</tt>: the expected execution time in h:m:s format. * <tt>nodes</tt>: a list of requirements on the nodes to be allocated, or possibly more than one joined by a plus sign. For example, <pre> qsub -l walltime=1:0:0 -l nodes=16:ppn=8 job.sh ## run for one hour on 16 nodes with 8 CPUs per node qsub -l walltime=0:1:0 -l nodes=2:chassis1:ppn=4 job2.sh ## run for one minute on 2 nodes from chassis1 with 4 CPUs per node qsub -l walltime=0:5:0 -l nodes=1:chassis1:ppn=1+1:chassis2:ppn=1 job3.sh ## run for 5 minutes on one node from chassis1 and one from chassis2 qsub -l walltime=0:0:10 -l nodes=node001.infi:ppn=8 job4.sh ## run for 10 seconds on all CPUs of node001 </pre> (In general you will not want to specify node or [[architecture|chassis]] names directly, as this will interfere with the efficiency of the scheduling. The exception might be if you want to guarantee that all inter-process communications take place within a chassis; see [[networking]]). Note also that in the present configuration requests for fewer processors per node than the 8 that are available may be consolidated onto fewer nodes. So the second request above would actually run on one physical node with 8 CPUs. This should never be a problem unless you are specifically testing inter-node communication. For efficiency, you are recommended always to ask for and use 8 processes per node. The defaults for all the other options are sensible but you ''must'' think carefully about the resources you need. Requesting more time or resources than you need will lead to inefficiency in scheduling your jobs: requesting less than you need (particularly in terms of walltime) is likely to mean that your job does not run to completion. A job that exceeds the walltime estimate will be terminated. 
Once a job is submitted, you can view its progress with <tt>qstat</tt>: <pre> >qstat Job id Name User Time Use S Queue ------------------- ---------------- --------------- -------- - ----- 1765.stri-cluster HCG62-doublebeta mjh 00:00:00 R all 1766.stri-cluster ...59-doublebeta mjh 00:00:00 R all 1767.stri-cluster ...07-doublebeta mjh 00:00:00 R all 1768.stri-cluster ...83-doublebeta mjh 00:00:00 R all 1769.stri-cluster ...73-doublebeta mjh 00:00:00 R all 1770.stri-cluster ...61-doublebeta mjh 0 Q all 1771.stri-cluster ...36-doublebeta mjh 0 Q all 1772.stri-cluster ...44-doublebeta mjh 0 Q all 1773.stri-cluster ...29-doublebeta mjh 0 Q all 1774.stri-cluster ...33-doublebeta mjh 0 Q all 1775.stri-cluster ...46-doublebeta mjh 0 Q all >qstat -a stri-cluster.herts.ac.uk: Req'd Req'd Elap Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time -------------------- -------- -------- ---------- ------ ----- --- ------ ----- - ----- 1765.stri-cluster.he mjh all HCG62-doub 11125 16 -- -- 05:00 R 01:56 1766.stri-cluster.he mjh all IC1459-dou 10658 16 -- -- 05:00 R 01:56 1767.stri-cluster.he mjh all NGC3607-do 7355 16 -- -- 05:00 R 01:56 1768.stri-cluster.he mjh all NGC383-dou 12069 16 -- -- 05:00 R 01:56 1769.stri-cluster.he mjh all NGC4073-do 12793 16 -- -- 05:00 R 01:56 1770.stri-cluster.he mjh all NGC4261-do -- 16 -- -- 05:00 Q -- 1771.stri-cluster.he mjh all NGC4636-do -- 16 -- -- 05:00 Q -- 1772.stri-cluster.he mjh all NGC5044-do -- 16 -- -- 05:00 Q -- 1773.stri-cluster.he mjh all NGC5129-do -- 16 -- -- 05:00 Q -- 1774.stri-cluster.he mjh all NGC533-dou -- 16 -- -- 05:00 Q -- 1775.stri-cluster.he mjh all NGC5846-do -- 16 -- -- 05:00 Q -- </pre> The different options to qstat give different views of the queue. Here we see that some jobs are running (R) and some are queued (Q). (Note that in the default <tt>qstat</tt> view the CPU time is not displayed correctly.) The MAUI tool <tt>showq</tt> can also be used to get a view of the whole queue: <pre> /usr/local/maui/bin/showq ACTIVE JOBS-------------------- JOBNAME USERNAME STATE PROC REMAINING STARTTIME 1765 mjh Running 128 1:15:20 Fri May 7 13:38:20 1766 mjh Running 128 1:15:20 Fri May 7 13:38:20 1767 mjh Running 128 1:15:20 Fri May 7 13:38:20 1768 mjh Running 128 1:15:20 Fri May 7 13:38:20 1769 mjh Running 128 1:15:20 Fri May 7 13:38:20 5 Active Jobs 640 of 640 Processors Active (100.00%) 80 of 80 Nodes Active (100.00%) IDLE JOBS---------------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME 1770 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1771 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1772 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1773 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1774 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1775 mjh Idle 128 5:00:00 Fri May 7 13:38:20 6 Idle Jobs BLOCKED JOBS---------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME Total Jobs: 11 Active Jobs: 5 Idle Jobs: 6 Blocked Jobs: 0 </pre> Finally, you can delete jobs from the queue with the Torque command <tt>qdel</tt>. == Running code == Your job will start running on ''one'' of the nodes it has been allocated. It is your script's responsibility to start running on any other nodes you have been allocated. At run time, environment variables contain information about the job. A full list is [http://www.clusterresources.com/torquedocs21/2.1jobsubmission.shtml#submitvars here]. The crucial one is $PBS_NODEFILE, which provides a list of nodes on which the job should run. 
''You must honour this node list.'' The [http://www.clusterresources.com/torquedocs21/commands/pbsdsh.shtml pbsdsh] command can be used in your script to run a single program on all the allocated nodes, as in the following example: <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ pbsdsh hostname echo ------------------------------------------------------ echo Job ends </pre> pbsdsh would be suitable for starting a number of parallel processes which do not need to communicate, e.g. multiple identical Monte Carlo simulations. For scripts to start [[MPI]] processes in such a way that inter-process communication (IPC) works see the description of [[MPI]]. == Environment == In all cases your script must set the environment that you want your program to run in. Environment variables, aliases and working directory are not inherited from the shell that runs qsub. So you will often want to change to an appropriate working directory and to source some startup files in your qsub script before you actually execute any code. <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ cd /home/fred/my_working_directory export PATH=/home/fred/my_binaries:${PATH} /usr/local/bin/mpiexec my-mpi-code arg1 arg2 echo ------------------------------------------------------ echo Job ends </pre> == Multiple job submission == The <tt>-t</tt> option to qsub allows you to create multiple jobs simultaneously. These will all run the same script but will differ in the value of the environment variable <tt>$PBS_ARRAYID</tt>. Your script should normally use this variable to decide how to behave differently. For example, multiple runs of a Monte Carlo simulation might use it to store the output in different files. The jobs will all be submitted to the queue immediately and may run concurrently, depending on the resources they request, so it is important to make sure that it is safe to do this. A common mistake is to use the -t option wrongly. <pre> qsub -t 4 myjob.qsub </pre> will only start ''one'' job, with $PBS_ARRAYID set to 4. <pre> qsub -t 1-4 myjob.qsub </pre> will queue four jobs, with $PBS_ARRAYID ranging from 1 to 4. 
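The sketch below shows the shape of a script intended for <tt>-t</tt> submission. The simulation binary, working directory, resource values and output naming are invented, but the pattern of using <tt>$PBS_ARRAYID</tt> to keep the runs' outputs separate is the one described above.
<pre>
#!/bin/sh
#PBS -N montecarlo
#PBS -l nodes=1:ppn=1
#PBS -l walltime=2:00:00
cd /home/fred/my_working_directory
## each member of the array writes to its own output file
./my-monte-carlo-code > run_${PBS_ARRAYID}.out
</pre>
Submitted with <tt>qsub -t 1-4 montecarlo.qsub</tt>, this queues four independent jobs writing to <tt>run_1.out</tt> ... <tt>run_4.out</tt>.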
7ba8853148192b37f5d8de1e4203b19bd891593b 84 83 2010-06-11T11:00:14Z Mjh 2 /* Basic commands */ wikitext text/x-wiki The main mechanism for carrying out computing on the cluster is the '''batch queue system'''. This is based around [http://www.clusterresources.com/torquedocs21/ Torque] (formerly known as PBS) and [http://www.clusterresources.com/products/maui/docs/mauiadmin.shtml Maui], which are free software and widely used in the HPC world, particularly in academia. Rather than running processes directly on the compute nodes, you submit a script which, in general, contains both directives to the queuing system and the commands needed to run your jobs. The scheduler will then decide, based on the available resources, whether your job can be run immediately or must be deferred till later. If other people's jobs are running, you are likely to have to wait. It is important therefore to request only the minimum resources you need: a job that only needs 8 nodes may be able to run immediately while a job that requests all 48 nodes in the main cluster may need to wait for days. Once your job starts running the standard output and standard error can be saved or discarded, at your option. Most jobs will operate on files in one of the cluster-wide file systems (see [[storage]]). You can also arrange to have the system send you e-mail when your job starts and/or ends. You should enable [[passwordless ssh]] between nodes before starting to use the job submission system. Various parts of it and code that depends on it rely on being able to use ssh or scp. == Basic commands == The command to submit a job to the batch queue system is <tt>qsub</tt>. <tt>qsub</tt> takes one obligatory argument, the name of the script that will be run to start the job: <pre> qsub myjob.sh </pre> This must be a script: specifying a binary file will not work. Optionally, qsub can take a number of arguments that specify what resources the job requires and how it is to be handled by the queuing system. Alternatively, these can be specified in the script, tagged with lines beginning <tt>#PBS</tt> before any executable lines. For example, <pre> cat << END > myjob1.sh #!/bin/sh echo "Hello world" END qsub -N hello -m abe myjob1.sh </pre> and <pre> cat <<END > myjob2.sh #!/bin/sh #PBS -N hello #PBS -m abe echo "Hello world" END qsub myjob2.sh </pre> are functionally equivalent. It is often more useful to have the options specified in the script. You can also have a mixture: if the same option is specified on the <tt>qsub</tt> command line and in a script directive, the command line takes precedence. A full list of qsub options is given by <tt>man qsub</tt> or is available [http://www.clusterresources.com/torquedocs21/commands/qsub.shtml online]. Some important ones are as follows: * -N: specify the name of the job as it will appear in the queue * -m: say when the system should e-mail you: <tt>-m abe</tt> means to e-mail on abort, begin and end * -k: specify whether the standard output or standard error streams of the job should be preserved (by default in your home directory). * -j: specify whether the output and error streams should be kept separate or merged. * -q: specify what [[queues|queue]] the job will be run on. * -t: start multiple jobs simultaneously (see below) The most important option is <tt>-l</tt>, which specifies the resources to be used by the job. On our system the useful resources to specify are * <tt>walltime</tt>: the expected execution time in h:m:s format. 
* <tt>nodes</tt>: a list of requirements on the nodes to be allocated, or possibly more than one joined by a plus sign. For example, <pre> qsub -l walltime=1:0:0 -l nodes=16:ppn=8 job.sh ## run for one hour on 16 nodes with 8 CPUs per node qsub -l walltime=0:1:0 -l nodes=2:chassis1:ppn=4 job2.sh ## run for one minute on 2 nodes from chassis1 with 4 CPUs per node qsub -l walltime=0:5:0 -l nodes=1:chassis1:ppn=1+1:chassis2:ppn=1 job3.sh ## run for 5 minutes on one node from chassis1 and one from chassis2 qsub -l walltime=0:0:10 -l nodes=node001.infi:ppn=8 job4.sh ## run for 10 seconds on all CPUs of node001 </pre> (In general you will not want to specify node or [[architecture|chassis]] names directly, as this will interfere with the efficiency of the scheduling. The exception might be if you want to guarantee that all inter-process communications take place within a chassis; see [[networking]]). Note also that in the present configuration requests for fewer processors per node than the 8 that are available may be consolidated onto fewer nodes. So the second request above would actually run on one physical node with 8 CPUs. This should never be a problem unless you are specifically testing inter-node communication. For efficiency, you are recommended always to ask for and use 8 processes per node for MPI or multi-threaded code. If you are running single-threaded code, you should ask for one processor per node. The defaults for all the other options are sensible but you ''must'' think carefully about the resources you need. Requesting more time or resources than you need will lead to inefficiency in scheduling your jobs: requesting less than you need (particularly in terms of walltime) is likely to mean that your job does not run to completion. A job that exceeds the walltime estimate will be terminated. 
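Tying the last two points together: a single-threaded code should request one processor on one node, and its walltime should be a realistic estimate with a little margin. The script name and time below are invented:
<pre>
qsub -l walltime=3:00:00 -l nodes=1:ppn=1 serialjob.sh   ## one CPU on one node for a single-threaded code
</pre>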
Once a job is submitted, you can view its progress with <tt>qstat</tt>: <pre> >qstat Job id Name User Time Use S Queue ------------------- ---------------- --------------- -------- - ----- 1765.stri-cluster HCG62-doublebeta mjh 00:00:00 R all 1766.stri-cluster ...59-doublebeta mjh 00:00:00 R all 1767.stri-cluster ...07-doublebeta mjh 00:00:00 R all 1768.stri-cluster ...83-doublebeta mjh 00:00:00 R all 1769.stri-cluster ...73-doublebeta mjh 00:00:00 R all 1770.stri-cluster ...61-doublebeta mjh 0 Q all 1771.stri-cluster ...36-doublebeta mjh 0 Q all 1772.stri-cluster ...44-doublebeta mjh 0 Q all 1773.stri-cluster ...29-doublebeta mjh 0 Q all 1774.stri-cluster ...33-doublebeta mjh 0 Q all 1775.stri-cluster ...46-doublebeta mjh 0 Q all >qstat -a stri-cluster.herts.ac.uk: Req'd Req'd Elap Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time -------------------- -------- -------- ---------- ------ ----- --- ------ ----- - ----- 1765.stri-cluster.he mjh all HCG62-doub 11125 16 -- -- 05:00 R 01:56 1766.stri-cluster.he mjh all IC1459-dou 10658 16 -- -- 05:00 R 01:56 1767.stri-cluster.he mjh all NGC3607-do 7355 16 -- -- 05:00 R 01:56 1768.stri-cluster.he mjh all NGC383-dou 12069 16 -- -- 05:00 R 01:56 1769.stri-cluster.he mjh all NGC4073-do 12793 16 -- -- 05:00 R 01:56 1770.stri-cluster.he mjh all NGC4261-do -- 16 -- -- 05:00 Q -- 1771.stri-cluster.he mjh all NGC4636-do -- 16 -- -- 05:00 Q -- 1772.stri-cluster.he mjh all NGC5044-do -- 16 -- -- 05:00 Q -- 1773.stri-cluster.he mjh all NGC5129-do -- 16 -- -- 05:00 Q -- 1774.stri-cluster.he mjh all NGC533-dou -- 16 -- -- 05:00 Q -- 1775.stri-cluster.he mjh all NGC5846-do -- 16 -- -- 05:00 Q -- </pre> The different options to qstat give different views of the queue. Here we see that some jobs are running (R) and some are queued (Q). (Note that in the default <tt>qstat</tt> view the CPU time is not displayed correctly.) The MAUI tool <tt>showq</tt> can also be used to get a view of the whole queue: <pre> /usr/local/maui/bin/showq ACTIVE JOBS-------------------- JOBNAME USERNAME STATE PROC REMAINING STARTTIME 1765 mjh Running 128 1:15:20 Fri May 7 13:38:20 1766 mjh Running 128 1:15:20 Fri May 7 13:38:20 1767 mjh Running 128 1:15:20 Fri May 7 13:38:20 1768 mjh Running 128 1:15:20 Fri May 7 13:38:20 1769 mjh Running 128 1:15:20 Fri May 7 13:38:20 5 Active Jobs 640 of 640 Processors Active (100.00%) 80 of 80 Nodes Active (100.00%) IDLE JOBS---------------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME 1770 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1771 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1772 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1773 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1774 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1775 mjh Idle 128 5:00:00 Fri May 7 13:38:20 6 Idle Jobs BLOCKED JOBS---------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME Total Jobs: 11 Active Jobs: 5 Idle Jobs: 6 Blocked Jobs: 0 </pre> Finally, you can delete jobs from the queue with the Torque command <tt>qdel</tt>. == Running code == Your job will start running on ''one'' of the nodes it has been allocated. It is your script's responsibility to start running on any other nodes you have been allocated. At run time, environment variables contain information about the job. A full list is [http://www.clusterresources.com/torquedocs21/2.1jobsubmission.shtml#submitvars here]. The crucial one is $PBS_NODEFILE, which provides a list of nodes on which the job should run. 
''You must honour this node list.'' The [http://www.clusterresources.com/torquedocs21/commands/pbsdsh.shtml pbsdsh] command can be used in your script to run a single program on all the allocated nodes, as in the following example: <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ pbsdsh hostname echo ------------------------------------------------------ echo Job ends </pre> pbsdsh would be suitable for starting a number of parallel processes which do not need to communicate, e.g. multiple identical Monte Carlo simulations. For scripts to start [[MPI]] processes in such a way that inter-process communication (IPC) works see the description of [[MPI]]. == Environment == In all cases your script must set the environment that you want your program to run in. Environment variables, aliases and working directory are not inherited from the shell that runs qsub. So you will often want to change to an appropriate working directory and to source some startup files in your qsub script before you actually execute any code. <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ cd /home/fred/my_working_directory export PATH=/home/fred/my_binaries:${PATH} /usr/local/bin/mpiexec my-mpi-code arg1 arg2 echo ------------------------------------------------------ echo Job ends </pre> == Multiple job submission == The <tt>-t</tt> option to qsub allows you to create multiple jobs simultaneously. These will all run the same script but will differ in the value of the environment variable <tt>$PBS_ARRAYID</tt>. Your script should normally use this variable to decide how to behave differently. For example, multiple runs of a Monte Carlo simulation might use it to store the output in different files. The jobs will all be submitted to the queue immediately and may run concurrently, depending on the resources they request, so it is important to make sure that it is safe to do this. A common mistake is to use the -t option wrongly. <pre> qsub -t 4 myjob.qsub </pre> will only start ''one'' job, with $PBS_ARRAYID set to 4. <pre> qsub -t 1-4 myjob.qsub </pre> will queue four jobs, with $PBS_ARRAYID ranging from 1 to 4. 
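A further note on the environment example above: rather than hard-coding the working directory, many scripts simply change to the directory from which <tt>qsub</tt> was invoked, which Torque makes available as <tt>$PBS_O_WORKDIR</tt>. A minimal sketch, with an invented code name and with the choice of startup file left to you:
<pre>
#!/bin/sh
#PBS -N env-example
#PBS -l nodes=1:ppn=8
#PBS -l walltime=0:10:0
cd $PBS_O_WORKDIR                    ## the directory qsub was run from
. $HOME/.profile                     ## or whichever startup file sets up your environment
export PATH=$HOME/my_binaries:$PATH
./my-code arg1 arg2
</pre>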
bb406a0f142971181dd3709c8ca258b5e6d0d56e 86 84 2010-06-17T07:08:18Z Mjh 2 wikitext text/x-wiki The main mechanism for carrying out computing on the cluster is the '''batch queue system'''. This is based around [http://www.clusterresources.com/torquedocs21/ Torque] (formerly known as PBS) and [http://www.clusterresources.com/products/maui/docs/mauiadmin.shtml Maui], which are free software and widely used in the HPC world, particularly in academia. Rather than running processes directly on the compute nodes, you submit a script which, in general, contains both directives to the queuing system and the commands needed to run your jobs. The scheduler will then decide, based on the available resources, whether your job can be run immediately or must be deferred till later. If other people's jobs are running, you are likely to have to wait. It is important therefore to request only the minimum resources you need: a job that only needs 8 nodes may be able to run immediately while a job that requests all 48 nodes in the main cluster may need to wait for days. Once your job starts running the standard output and standard error can be saved or discarded, at your option. Most jobs will operate on files in one of the cluster-wide file systems (see [[storage]]). You can also arrange to have the system send you e-mail when your job starts and/or ends. You should enable [[passwordless ssh]] between nodes before starting to use the job submission system. Various parts of it and code that depends on it rely on being able to use ssh or scp. == Basic commands == The command to submit a job to the batch queue system is <tt>qsub</tt>. <tt>qsub</tt> takes one obligatory argument, the name of the script that will be run to start the job: <pre> qsub myjob.sh </pre> This must be a script: specifying a binary file will not work. Optionally, qsub can take a number of arguments that specify what resources the job requires and how it is to be handled by the queuing system. Alternatively, these can be specified in the script, tagged with lines beginning <tt>#PBS</tt> before any executable lines. For example, <pre> cat << END > myjob1.sh #!/bin/sh echo "Hello world" END qsub -N hello -m abe myjob1.sh </pre> and <pre> cat <<END > myjob2.sh #!/bin/sh #PBS -N hello #PBS -m abe echo "Hello world" END qsub myjob2.sh </pre> are functionally equivalent. It is often more useful to have the options specified in the script. You can also have a mixture: if the same option is specified on the <tt>qsub</tt> command line and in a script directive, the command line takes precedence. A full list of qsub options is given by <tt>man qsub</tt> or is available [http://www.clusterresources.com/torquedocs21/commands/qsub.shtml online]. Some important ones are as follows: * -N: specify the name of the job as it will appear in the queue * -m: say when the system should e-mail you: <tt>-m abe</tt> means to e-mail on abort, begin and end * -k: specify whether the standard output or standard error streams of the job should be preserved (by default in your home directory). * -j: specify whether the output and error streams should be kept separate or merged. * -q: specify what [[queues|queue]] the job will be run on. * -t: start multiple jobs simultaneously (see below) The most important option is <tt>-l</tt>, which specifies the resources to be used by the job. On our system the useful resources to specify are * <tt>walltime</tt>: the expected execution time in h:m:s format. 
* <tt>nodes</tt>: a list of requirements on the nodes to be allocated, or possibly more than one joined by a plus sign. For example, <pre> qsub -l walltime=1:0:0 -l nodes=16:ppn=8 job.sh ## run for one hour on 16 nodes with 8 CPUs per node qsub -l walltime=0:1:0 -l nodes=2:chassis1:ppn=4 job2.sh ## run for one minute on 2 nodes from chassis1 with 4 CPUs per node qsub -l walltime=0:5:0 -l nodes=1:chassis1:ppn=1+1:chassis2:ppn=1 job3.sh ## run for 5 minutes on one node from chassis1 and one from chassis2 qsub -l walltime=0:0:10 -l nodes=node001.infi:ppn=8 job4.sh ## run for 10 seconds on all CPUs of node001 </pre> (In general you will not want to specify node or [[architecture|chassis]] names directly, as this will interfere with the efficiency of the scheduling. The exception might be if you want to guarantee that all inter-process communications take place within a chassis; see [[networking]]). Note also that in the present configuration requests for fewer processors per node than the 8 that are available may be consolidated onto fewer nodes. So the second request above would actually run on one physical node with 8 CPUs. This should never be a problem unless you are specifically testing inter-node communication. For efficiency, you are recommended always to ask for and use 8 processes per node for MPI or multi-threaded code. If you are running single-threaded code, you should ask for one processor per node. The defaults for all the other options are sensible but you ''must'' think carefully about the resources you need. Requesting more time or resources than you need will lead to inefficiency in scheduling your jobs: requesting less than you need (particularly in terms of walltime) is likely to mean that your job does not run to completion. A job that exceeds the walltime estimate will be terminated. 
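For instance (the script names here are purely illustrative), requests matching the recommendations above might look like: <pre>
qsub -l walltime=2:00:00 -l nodes=4:ppn=8 mpi-job.sh     ## MPI or multi-threaded code: all 8 processors on each node
qsub -l walltime=2:00:00 -l nodes=1:ppn=1 serial-job.sh  ## single-threaded code: one processor on one node
</pre>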
''A walltime should be specified: the default walltime (see [[Queues]]) is probably not what you want.'' Once a job is submitted, you can view its progress with <tt>qstat</tt>: <pre> >qstat Job id Name User Time Use S Queue ------------------- ---------------- --------------- -------- - ----- 1765.stri-cluster HCG62-doublebeta mjh 00:00:00 R all 1766.stri-cluster ...59-doublebeta mjh 00:00:00 R all 1767.stri-cluster ...07-doublebeta mjh 00:00:00 R all 1768.stri-cluster ...83-doublebeta mjh 00:00:00 R all 1769.stri-cluster ...73-doublebeta mjh 00:00:00 R all 1770.stri-cluster ...61-doublebeta mjh 0 Q all 1771.stri-cluster ...36-doublebeta mjh 0 Q all 1772.stri-cluster ...44-doublebeta mjh 0 Q all 1773.stri-cluster ...29-doublebeta mjh 0 Q all 1774.stri-cluster ...33-doublebeta mjh 0 Q all 1775.stri-cluster ...46-doublebeta mjh 0 Q all >qstat -a stri-cluster.herts.ac.uk: Req'd Req'd Elap Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time -------------------- -------- -------- ---------- ------ ----- --- ------ ----- - ----- 1765.stri-cluster.he mjh all HCG62-doub 11125 16 -- -- 05:00 R 01:56 1766.stri-cluster.he mjh all IC1459-dou 10658 16 -- -- 05:00 R 01:56 1767.stri-cluster.he mjh all NGC3607-do 7355 16 -- -- 05:00 R 01:56 1768.stri-cluster.he mjh all NGC383-dou 12069 16 -- -- 05:00 R 01:56 1769.stri-cluster.he mjh all NGC4073-do 12793 16 -- -- 05:00 R 01:56 1770.stri-cluster.he mjh all NGC4261-do -- 16 -- -- 05:00 Q -- 1771.stri-cluster.he mjh all NGC4636-do -- 16 -- -- 05:00 Q -- 1772.stri-cluster.he mjh all NGC5044-do -- 16 -- -- 05:00 Q -- 1773.stri-cluster.he mjh all NGC5129-do -- 16 -- -- 05:00 Q -- 1774.stri-cluster.he mjh all NGC533-dou -- 16 -- -- 05:00 Q -- 1775.stri-cluster.he mjh all NGC5846-do -- 16 -- -- 05:00 Q -- </pre> The different options to qstat give different views of the queue. Here we see that some jobs are running (R) and some are queued (Q). (Note that in the default <tt>qstat</tt> view the CPU time is not displayed correctly.) The MAUI tool <tt>showq</tt> can also be used to get a view of the whole queue: <pre> /usr/local/maui/bin/showq ACTIVE JOBS-------------------- JOBNAME USERNAME STATE PROC REMAINING STARTTIME 1765 mjh Running 128 1:15:20 Fri May 7 13:38:20 1766 mjh Running 128 1:15:20 Fri May 7 13:38:20 1767 mjh Running 128 1:15:20 Fri May 7 13:38:20 1768 mjh Running 128 1:15:20 Fri May 7 13:38:20 1769 mjh Running 128 1:15:20 Fri May 7 13:38:20 5 Active Jobs 640 of 640 Processors Active (100.00%) 80 of 80 Nodes Active (100.00%) IDLE JOBS---------------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME 1770 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1771 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1772 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1773 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1774 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1775 mjh Idle 128 5:00:00 Fri May 7 13:38:20 6 Idle Jobs BLOCKED JOBS---------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME Total Jobs: 11 Active Jobs: 5 Idle Jobs: 6 Blocked Jobs: 0 </pre> Finally, you can delete jobs from the queue with the Torque command <tt>qdel</tt>. == Running code == Your job will start running on ''one'' of the nodes it has been allocated. It is your script's responsibility to start running on any other nodes you have been allocated. At run time, environment variables contain information about the job. A full list is [http://www.clusterresources.com/torquedocs21/2.1jobsubmission.shtml#submitvars here]. 
The crucial one is $PBS_NODEFILE, which provides a list of nodes on which the job should run. ''You must honour this node list.'' The [http://www.clusterresources.com/torquedocs21/commands/pbsdsh.shtml pbsdsh] command can be used in your script to run a single program on all the allocated nodes, as in the following example: <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ pbsdsh hostname echo ------------------------------------------------------ echo Job ends </pre> pbsdsh would be suitable for starting a number of parallel processes which do not need to communicate, e.g. multiple identical Monte Carlo simulations. For scripts to start [[MPI]] processes in such a way that inter-process communication (IPC) works see the description of [[MPI]]. == Environment == In all cases your script must set the environment that you want your program to run in. Environment variables, aliases and working directory are not inherited from the shell that runs qsub. So you will often want to change to an appropriate working directory and to source some startup files in your qsub script before you actually execute any code. <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ cd /home/fred/my_working_directory export PATH=/home/fred/my_binaries:${PATH} /usr/local/bin/mpiexec my-mpi-code arg1 arg2 echo ------------------------------------------------------ echo Job ends </pre> == Multiple job submission == The <tt>-t</tt> option to qsub allows you to create multiple jobs simultaneously. These will all run the same script but will differ in the value of the environment variable <tt>$PBS_ARRAYID</tt>. Your script should normally use this variable to decide how to behave differently. For example, multiple runs of a Monte Carlo simulation might use it to store the output in different files. The jobs will all be submitted to the queue immediately and may run concurrently, depending on the resources they request, so it is important to make sure that it is safe to do this. A common mistake is to use the -t option wrongly. <pre> qsub -t 4 myjob.qsub </pre> will only start ''one'' job, with $PBS_ARRAYID set to 4. 
<pre> qsub -t 1-4 myjob.qsub </pre> will queue four jobs, with $PBS_ARRAYID ranging from 1 to 4. bb214ac8329c548df6ebb1da6e806df033530a4f Main Page 0 1 52 48 2010-05-14T13:58:51Z Mjh 2 wikitext text/x-wiki == Welcome to the cluster documentation wiki == This wiki is the location for documentation for the STRI cluster. If you are a cluster user, feel free to register for an account so that you can describe any method you use to do a particular thing on the cluster. Some local documentation on [[MediaWiki]] is available. == Cluster basics == * [[Accounts]] * [[Policies]] * [[Access]] * [[Architecture]] * [[Networking]] * [[Administrators]]' contact details == Using the cluster == * [[Jobs]] * [[Queues]] * [[MPI]] * [[Parallelization|How to parallelize your job]] * [[Compilers]] 4a24fd4b4306aa294dfda7a047a996dcc7c32474 54 52 2010-05-17T10:47:38Z Mjh 2 wikitext text/x-wiki == Welcome to the cluster documentation wiki == This wiki is the location for documentation for the STRI cluster. If you are a cluster user, feel free to register for an account so that you can describe any method you use to do a particular thing on the cluster. Some local documentation on [[MediaWiki]] is available. == Cluster basics == * [[Accounts]] * [[Policies]] * [[Access]] * [[Architecture]] * [[Networking]] * [[Administrators]]' contact details == Using the cluster == * [[Jobs]] * [[Queues]] * [[MPI]] * [[Parallelization|How to parallelize your job]] * [[Compilers]] * [[Software]] 94eeae5af50a6e24c5dc201c3a32b8a2e46d8ee5 58 54 2010-05-17T13:50:44Z Mjh 2 wikitext text/x-wiki == Welcome to the cluster documentation wiki == This wiki is the location for documentation for the STRI cluster. If you are a cluster user, feel free to register for an account so that you can describe any method you use to do a particular thing on the cluster. Some local documentation on [[MediaWiki]] is available. == Cluster basics == * [[Accounts]] * [[Policies]] * [[Access]] * [[Architecture]] * [[Networking]] * [[Storage]] * [[Administrators]]' contact details == Using the cluster == * [[Jobs]] * [[Queues]] * [[MPI]] * [[Parallelization|How to parallelize your job]] * [[Compilers]] * [[Software]] 60e9f968a172950c4e2c40a1f35e607adac2ac9b 69 58 2010-05-30T07:41:55Z Mjh 2 /* Using the cluster */ wikitext text/x-wiki == Welcome to the cluster documentation wiki == This wiki is the location for documentation for the STRI cluster. If you are a cluster user, feel free to register for an account so that you can describe any method you use to do a particular thing on the cluster. Some local documentation on [[MediaWiki]] is available. == Cluster basics == * [[Accounts]] * [[Policies]] * [[Access]] * [[Architecture]] * [[Networking]] * [[Storage]] * [[Administrators]]' contact details == Using the cluster == * [[Jobs]] * [[Queues]] * [[MPI]] * [[Parallelization|How to parallelize your job]] * [[Compilers]] * [[Software]] * [[Mail]] 3d8adda5c9b09716753401f77941e2b434256463 Compilers 0 16 53 2010-05-14T13:59:48Z Mjh 2 Created page with 'The standard C and Fortran compilers are <tt>gcc</tt> and <tt>gfortran</tt>. The Intel compilers <tt>icc</tt> and <tt>ifort</tt> are also available and may be superior for many a…' wikitext text/x-wiki The standard C and Fortran compilers are <tt>gcc</tt> and <tt>gfortran</tt>. The Intel compilers <tt>icc</tt> and <tt>ifort</tt> are also available and may be superior for many applications. 
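As a quick illustration (the source file names are hypothetical), the same code can be built with either toolchain: <pre>
gcc -O2 -o mycode mycode.c         # GNU C compiler
gfortran -O2 -o mycode mycode.f90  # GNU Fortran compiler
icc -O2 -o mycode mycode.c         # Intel C compiler
ifort -O2 -o mycode mycode.f90     # Intel Fortran compiler
</pre> It may be worth timing a representative run with both before committing to one.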
7c81ae718d3204ab98d6cc3b0751f6affbde75a4 Software 0 17 55 2010-05-17T10:53:48Z Mjh 2 Created page with 'This page documents locations of software. Detailed local documentation of software should go in a page specific to that software. * gromacs: 4.0.7 installed in <tt>/soft/groma…' wikitext text/x-wiki This page documents locations of software. Detailed local documentation of software should go in a page specific to that software. * gromacs: 4.0.7 installed in <tt>/soft/gromacs</tt> * Autodock Vina: 1.1.1 installed in <tt>/soft/autodock_vina_1_1_1_linux_x86 * Autodock: 4.2 installed in <tt>/soft/autodock</tt> 0b771b81b012748ea7e9647d2350f35b8d8ed34d 56 55 2010-05-17T10:53:58Z Mjh 2 wikitext text/x-wiki This page documents locations of software. Detailed local documentation of software should go in a page specific to that software. * gromacs: 4.0.7 installed in <tt>/soft/gromacs</tt> * Autodock Vina: 1.1.1 installed in <tt>/soft/autodock_vina_1_1_1_linux_x86</tt> * Autodock: 4.2 installed in <tt>/soft/autodock</tt> fe3ef45623b85fa43e8b1575d5778da02d82d1b1 57 56 2010-05-17T11:12:36Z Mjh 2 wikitext text/x-wiki This page documents locations of software. Detailed local documentation of software should go in a page specific to that software. * gromacs: 4.0.7 installed in <tt>/soft/gromacs</tt> * Autodock Vina: 1.1.1 installed in <tt>/soft/autodock_vina_1_1_1_linux_x86</tt> * Autodock: 4.2 installed in <tt>/soft/autodock</tt> * iGemDock: 2.1 installed in <tt>/soft/iGEMDOCKv2.1-centos/</tt> 457ad71a3e4c7591a5f6bd246ea428ffc2ea3dc5 79 57 2010-06-02T09:25:30Z Akukol 3 wikitext text/x-wiki This page documents locations of software. Detailed local documentation of software should go in a page specific to that software. * [[Gromacs]]: 4.0.7 installed in <tt>/soft/gromacs</tt> * Autodock Vina: 1.1.1 installed in <tt>/soft/autodock_vina_1_1_1_linux_x86</tt> * Autodock: 4.2 installed in <tt>/soft/autodock</tt> * iGemDock: 2.1 installed in <tt>/soft/iGEMDOCKv2.1-centos/</tt> fa8b545852c44b542e221afcd777279f23362bb2 80 79 2010-06-02T09:31:12Z Akukol 3 wikitext text/x-wiki This page documents locations of software. Detailed local documentation of software should go in a page specific to that software. * <u>[[Gromacs]]<\u>: 4.0.7 installed in <tt>/soft/gromacs</tt> * Autodock Vina: 1.1.1 installed in <tt>/soft/autodock_vina_1_1_1_linux_x86</tt> * Autodock: 4.2 installed in <tt>/soft/autodock</tt> * iGemDock: 2.1 installed in <tt>/soft/iGEMDOCKv2.1-centos/</tt> 8659943ecf8b773bae944bb2505e9dffabb0117e 81 80 2010-06-02T09:31:33Z Akukol 3 wikitext text/x-wiki This page documents locations of software. Detailed local documentation of software should go in a page specific to that software. * <u>[[Gromacs]]</u>: 4.0.7 installed in <tt>/soft/gromacs</tt> * Autodock Vina: 1.1.1 installed in <tt>/soft/autodock_vina_1_1_1_linux_x86</tt> * Autodock: 4.2 installed in <tt>/soft/autodock</tt> * iGemDock: 2.1 installed in <tt>/soft/iGEMDOCKv2.1-centos/</tt> 7f78613aad13b44a2fbde9dde9d84105e3591458 MPI 0 12 59 45 2010-05-18T11:27:58Z Mjh 2 wikitext text/x-wiki == What is MPI? == MPI stands for ''Message Passing Interface''. It provides a standard for communication between processes either running on the same host or over a network. Programs written using this standard can communicate with each other without needing to know details of the way in which they are connected. For a more detailed description see the [http://en.wikipedia.org/wiki/Message_Passing_Interface wikipedia page]. 
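In practice (a hedged sketch only; the file names are hypothetical), an MPI program is compiled with a wrapper compiler such as <tt>mpicc</tt> and then started through the batch system rather than run by hand: <pre>
mpicc -O2 -o my-mpi-code my-mpi-code.c   # compile against the currently selected MPI implementation
qsub my-mpi-job.sh                       # the job script then launches my-mpi-code on the allocated nodes
</pre>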
The MPI standard (with extensions in some cases) has been implemented by many different groups. MPI tutorials are widely available on the web. == MPI on the cluster == There are several implementations of MPI on the cluster. All of them provide compilation tools (mpicc, mpif77), libraries, and a means of running code. All use of MPI on the cluster must be integrated with the [[jobs|job control system]] and this page describes how to do that. === MPICH2 === MPICH2 is the default implementation since it is provided as a package within Fedora. If you take no action, your MPI code will be compiled with MPICH2. Check which implementation you're using like this: <pre> > which mpicc /usr/lib64/mpich2/bin/mpicc </pre> MPICH2 is '''not''' Infiniband-aware. The communication between processes will use the Infiniband network, but will not use it natively: thus the latency and bandwidth of IPC will be about an order of magnitude worse than it would be with Infiniband-aware implementations (see below). To run MPICH2 jobs, your job control system script should call <tt>/usr/local/bin/mpiexec</tt>: <pre> #!/bin/sh -f #PBS -N mpich2-demo #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ /usr/local/bin/mpiexec /home/myusername/mympijob echo ------------------------------------------------------ echo Job ends </pre> <tt>mpiexec</tt> uses the Torque environment variables to start processes up on the appropriate nodes. Note that in this example the '''only''' key line is the one calling <tt>mpiexec</tt>. The others are Torque directives or are there for debugging/demonstration purposes. === MVAPICH2 === MVAPICH2 is one of the three MPI implementations installed as part of the [http://www.openfabrics.org/ OFED] Infiniband software distribution. It uses native Infiniband system calls to communicate. This means that the latency of IPC can be as low as a few microseconds and the full bandwidth of the Infiniband interfaces is available. To use MVIPICH2 run the command mpi-selector-menu on the head node and choose the appropriate option. When you next log in, your path and libraries will be set up correctly: <pre> > which mpicc /usr/mpi/gcc/mvapich2-1.4.1/bin/mpicc </pre> MVAPICH2 integration with Torque is not as good as for MPICH2 (at present <tt>mpiexec</tt> does not work, though it's supposed to). A script <tt>/soft/bin/torque-mv</tt> has been provided as a partial replacement for <tt>mpiexec</tt>; this uses the ssh-based <tt>mpirun_rsh</tt> tool to start jobs on all appropriate machines. You will need [[passwordless ssh]] set up for this to work. 
<pre> #!/bin/sh -f #PBS -N mvapich2-demo #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ /soft/bin/torque-mv /home/myusername/mympijob echo ------------------------------------------------------ echo Job ends </pre> === MVAPICH === This is an earlier Infiniband-aware implementation of MPI. Not recommended. === OpenMPI === This is the third implementation provided by the OFED packages; like the others, it can be selected using <tt>mpi-selector-menu</tt>. In principle it should be as good as MVIPICH2 from the point of view of Infiniband use. If you want to use it you will need to integrate it with Torque yourself -- please feel free to do so and to document it here. 2fdc78229b8ba6c14f2ab6eb0488f9d90c653d29 60 59 2010-05-18T12:54:41Z Mjh 2 wikitext text/x-wiki == What is MPI? == MPI stands for ''Message Passing Interface''. It provides a standard for communication between processes either running on the same host or over a network. Programs written using this standard can communicate with each other without needing to know details of the way in which they are connected. For a more detailed description see the [http://en.wikipedia.org/wiki/Message_Passing_Interface wikipedia page]. The MPI standard (with extensions in some cases) has been implemented by many different groups. MPI tutorials are widely available on the web. == MPI on the cluster == There are several implementations of MPI on the cluster. All of them provide compilation tools (mpicc, mpif77), libraries, and a means of running code. All use of MPI on the cluster must be integrated with the [[jobs|job control system]] and this page describes how to do that. You should read the page on [[jobs]] before trying to do anything with any of the example scripts given here! === MPICH2 === MPICH2 is the default implementation since it is provided as a package within Fedora. If you take no action, your MPI code will be compiled with MPICH2. Check which implementation you're using like this: <pre> > which mpicc /usr/lib64/mpich2/bin/mpicc </pre> MPICH2 is '''not''' Infiniband-aware. The communication between processes will use the Infiniband network, but will not use it natively: thus the latency and bandwidth of IPC will be about an order of magnitude worse than it would be with Infiniband-aware implementations (see below). 
To run MPICH2 jobs, your job control system script should call <tt>/usr/local/bin/mpiexec</tt>: <pre> #!/bin/sh -f #PBS -N mpich2-demo #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ /usr/local/bin/mpiexec /home/myusername/mympijob echo ------------------------------------------------------ echo Job ends </pre> <tt>mpiexec</tt> uses the Torque environment variables to start processes up on the appropriate nodes. Note that in this example the '''only''' key line is the one calling <tt>mpiexec</tt>. The others are Torque directives or are there for debugging/demonstration purposes. === MVAPICH2 === MVAPICH2 is one of the three MPI implementations installed as part of the [http://www.openfabrics.org/ OFED] Infiniband software distribution. It uses native Infiniband system calls to communicate. This means that the latency of IPC can be as low as a few microseconds and the full bandwidth of the Infiniband interfaces is available. To use MVAPICH2, run the command mpi-selector-menu on the head node and choose the appropriate option. When you next log in, your path and libraries will be set up correctly: <pre> > which mpicc /usr/mpi/gcc/mvapich2-1.4.1/bin/mpicc </pre> MVAPICH2 integration with Torque is not as good as for MPICH2 (at present <tt>mpiexec</tt> does not work, though it's supposed to). A script <tt>/soft/bin/torque-mv</tt> has been provided as a partial replacement for <tt>mpiexec</tt>; this uses the ssh-based <tt>mpirun_rsh</tt> tool to start jobs on all appropriate machines. You will need [[passwordless ssh]] set up for this to work. <pre> #!/bin/sh -f #PBS -N mvapich2-demo #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ /soft/bin/torque-mv /home/myusername/mympijob echo ------------------------------------------------------ echo Job ends </pre> === MVAPICH === This is an earlier Infiniband-aware implementation of MPI. Not recommended. === OpenMPI === This is the third implementation provided by the OFED packages; like the others, it can be selected using <tt>mpi-selector-menu</tt>. In principle it should be as good as MVAPICH2 from the point of view of Infiniband use.
If you want to use it you will need to integrate it with Torque yourself -- please feel free to do so and to document it here. bc30c85ef90578024715a565be2c90bdc36e7538 Architecture 0 7 61 27 2010-05-19T15:21:12Z WikiSysop 1 wikitext text/x-wiki The cluster consists of * a head node, which is an 8-core Xeon-based machine with 32 Gb RAM * 80 compute nodes (or just 'nodes'), of which ** 48 8-core Xeons with 24 Gb RAM and DDR Infiniband form the Main cluster ** 32 8-core Xeons with 12 Gb RAM and QDR Infiniband form the CAIR cluster * 110 Tb of [[storage]] attached via Fibre Channel to the head node * Ethernet and infiniband switches to provide connectivity. The nodes are Dell blades installed in enclosures (chassis) of 16 blades each: so there are 5 chassis in total, 3 in the main cluster and 2 in the CAIR cluster. Nodes within a given chassis communicate via switches internal to the chassis; external switches connect the chassis together. See [[networking]] for more details. http://stri-cluster.herts.ac.uk/cluster.jpg 26fa825472372d8a5b67081c276dd53f73e16df5 62 61 2010-05-20T15:11:27Z Mjh 2 wikitext text/x-wiki The cluster consists of * a head node, which is an 8-core Xeon-based machine with 32 Gb RAM * 80 compute nodes (or just 'nodes'), of which ** 48 8-core Xeons with 24 Gb RAM and DDR Infiniband form the Main cluster (chassis 1, 2 and 3) ** 32 8-core Xeons with 12 Gb RAM and QDR Infiniband form the CAIR cluster (chassis 4 and 5) * 110 Tb of [[storage]] attached via Fibre Channel to the head node * Ethernet and infiniband switches to provide connectivity. The nodes are Dell blades installed in enclosures (chassis) of 16 blades each: so there are 5 chassis in total, 3 in the main cluster and 2 in the CAIR cluster. Nodes within a given chassis communicate via switches internal to the chassis; external switches connect the chassis together. See [[networking]] for more details. http://stri-cluster.herts.ac.uk/cluster.jpg 7b66f021aeee6413bca6858eef7680da2c0a6390 Passwordless ssh 0 13 66 44 2010-05-28T15:13:11Z Mjh 2 wikitext text/x-wiki For some applications (including use of the job submission system) you will need to enable passwordless ssh between nodes. The simplest way of doing this is as follows: * run <tt>ssh-keygen</tt> and generate a key with no passphrase (just press return when prompted). * cd into your <tt>~/.ssh</tt> directory. * <tt>cat id_rsa.pub >> authorized_keys</tt> * Passwordless ssh is now set up: try e.g. <tt>ssh node001 hostname</tt> to test it. 72fd93ffcb7a5e2021fea019daf8a5ef78c980fb 67 66 2010-05-28T15:13:37Z Mjh 2 wikitext text/x-wiki For some applications (including use of the [[jobs|job submission system]]) you will need to enable passwordless ssh between nodes. The simplest way of doing this is as follows: * run <tt>ssh-keygen</tt> and generate a key with no passphrase (just press return when prompted). * cd into your <tt>~/.ssh</tt> directory. * <tt>cat id_rsa.pub >> authorized_keys</tt> * Passwordless ssh is now set up: try e.g. <tt>ssh node001 hostname</tt> to test it. a72bcb9c5d294396d9365461bda127b1c51e5c93 71 67 2010-06-01T09:08:13Z Mjh 2 wikitext text/x-wiki For some applications (including use of the [[jobs|job submission system]]) you will need to enable passwordless ssh between nodes. The simplest way of doing this is as follows: * run <tt>ssh-keygen</tt> and generate a key with no passphrase (just press return when prompted). * cd into your <tt>~/.ssh</tt> directory. * <tt>cat id_rsa.pub >> authorized_keys</tt> * Passwordless ssh is now set up: try e.g. 
<tt>ssh node001 hostname</tt> to test it. Note that you are ''not'' permitted to use this to run jobs on the nodes: see [[Policies]] for more. bb4735e1cb4955873784c3e577f22f337a7ec656 Mail 0 18 70 2010-05-30T07:45:48Z Mjh 2 Created page with 'Various systems on the cluster will want to send you e-mail. By default this will be delivered to a local mailbox: you will need to read it with a command-line mail tool like <tt…' wikitext text/x-wiki Various systems on the cluster will want to send you e-mail. By default this will be delivered to a local mailbox: you will need to read it with a command-line mail tool like <tt>mutt</tt> on the head node. You are advised to set up a <tt>.forward</tt> file which will send it to your normal inbox. To do this, simply create the file containing one line of text, the e-mail address to forward to: <pre> cat <<END >.forward f.bloggs@herts.ac.uk END </pre> Please don't allow your inbox on the cluster to fill up with large messages. d3cb935f23039bce1ac1d56bd2b0dcd4b2c39b43 Gromacs 0 19 72 2010-06-02T09:11:35Z Akukol 3 Run Gromacs wikitext text/x-wiki == '''How to perform a simulation with Gromacs' mdrun:''' == 1) You need prepare the binary simulation start file (tpr-file) either on your local machine or on the headnode of the cluster. If you prepare it on the local machine, make sure you use the same version of Gromacs. 2) Prepare a c-shell script (e.g. runjob.sh) as shown in the example below. 3) Make the script executable: chmod +x runjob.sh 4) Submit the job to the cluster: qsub runjob.sh The script below needs to be adjusted according to your needs, i.e. the user name (#PBS -u), the number of nodes to use (#PBS -l, each nodes has 8 CPU cores), the working directory and the details of the mdrun command. -------------- #!/bin/sh #PBS -N GromacsTest #PBS -q main #PBS -l nodes=1:ppn=8 #PBS -j oe #PBS -u akukol # runs a job with name GromacsTest on the 'main' cluster # uses 1 node and 8 CPUs (each nodes has 8 CPUs) # merge 'output' and 'standard error' and output both to 'standard output' # set required paths: source /soft/gromacs/bin/GMXRC export LD_LIBRARY_PATH='/usr/lib64/mpich2/lib:$LD_LIBRARY_PATH' # specify working directory: cd /home/akukol/groTest ### This is the command ### /usr/local/bin/mpiexec mdrun -s md_test.tpr -c after_md.gro -v -stepout 1000 ### command end ### # start with 'qsub runjob.sh' 0cb22746efc497b15d1f65b26f41c7e970e885ad 73 72 2010-06-02T09:16:24Z Akukol 3 /* How to perform a simulation with Gromacs' mdrun: */ wikitext text/x-wiki == '''How to perform a simulation with Gromacs' mdrun:''' == 1) You need prepare the binary simulation start file (tpr-file) either on your local machine or on the headnode of the cluster. If you prepare it on the local machine, make sure you use the same version of Gromacs. 2) Prepare a c-shell script (e.g. runjob.sh) as shown in the example below. 3) Make the script executable: chmod +x runjob.sh 4) Submit the job to the cluster: qsub runjob.sh The script below needs to be adjusted according to your needs, i.e. the user name (#PBS -u), the number of nodes to use (#PBS -l, each nodes has 8 CPU cores), the working directory and the details of the mdrun command. 
-------------- &#!/bin/sh &#PBS -N GromacsTest #PBS -q main #PBS -l nodes=1:ppn=8 #PBS -j oe #PBS -u akukol # runs a job with name GromacsTest on the 'main' cluster # uses 1 node and 8 CPUs (each nodes has 8 CPUs) # merge 'output' and 'standard error' and output both to 'standard output' # set required paths: source /soft/gromacs/bin/GMXRC export LD_LIBRARY_PATH='/usr/lib64/mpich2/lib:$LD_LIBRARY_PATH' # specify working directory: cd /home/akukol/groTest ### This is the command ### /usr/local/bin/mpiexec mdrun -s md_test.tpr -c after_md.gro -v -stepout 1000 ### command end ### # start with 'qsub runjob.sh' 5260f556b2a44c46c12e244be839b0714d616b09 74 73 2010-06-02T09:17:39Z Akukol 3 /* How to perform a simulation with Gromacs' mdrun: */ wikitext text/x-wiki == '''How to perform a simulation with Gromacs' mdrun:''' == 1) You need prepare the binary simulation start file (tpr-file) either on your local machine or on the headnode of the cluster. If you prepare it on the local machine, make sure you use the same version of Gromacs. 2) Prepare a c-shell script (e.g. runjob.sh) as shown in the example below. 3) Make the script executable: chmod +x runjob.sh 4) Submit the job to the cluster: qsub runjob.sh The script below needs to be adjusted according to your needs, i.e. the user name (#PBS -u), the number of nodes to use (#PBS -l, each nodes has 8 CPU cores), the working directory and the details of the mdrun command. -------------- <nowiki>#!/bin/sh <nowiki>#PBS -N GromacsTest #PBS -q main #PBS -l nodes=1:ppn=8 #PBS -j oe #PBS -u akukol # runs a job with name GromacsTest on the 'main' cluster # uses 1 node and 8 CPUs (each nodes has 8 CPUs) # merge 'output' and 'standard error' and output both to 'standard output' # set required paths: source /soft/gromacs/bin/GMXRC export LD_LIBRARY_PATH='/usr/lib64/mpich2/lib:$LD_LIBRARY_PATH' # specify working directory: cd /home/akukol/groTest ### This is the command ### /usr/local/bin/mpiexec mdrun -s md_test.tpr -c after_md.gro -v -stepout 1000 ### command end ### # start with 'qsub runjob.sh' 9a66022d9cadcb0af31471a5dcacda2ab8a35c15 75 74 2010-06-02T09:18:14Z Akukol 3 /* How to perform a simulation with Gromacs' mdrun: */ wikitext text/x-wiki == '''How to perform a simulation with Gromacs' mdrun:''' == 1) You need prepare the binary simulation start file (tpr-file) either on your local machine or on the headnode of the cluster. If you prepare it on the local machine, make sure you use the same version of Gromacs. 2) Prepare a c-shell script (e.g. runjob.sh) as shown in the example below. 3) Make the script executable: chmod +x runjob.sh 4) Submit the job to the cluster: qsub runjob.sh The script below needs to be adjusted according to your needs, i.e. the user name (#PBS -u), the number of nodes to use (#PBS -l, each nodes has 8 CPU cores), the working directory and the details of the mdrun command. 
-------------- <nowiki> #!/bin/sh #PBS -N GromacsTest #PBS -q main #PBS -l nodes=1:ppn=8 #PBS -j oe #PBS -u akukol # runs a job with name GromacsTest on the 'main' cluster # uses 1 node and 8 CPUs (each nodes has 8 CPUs) # merge 'output' and 'standard error' and output both to 'standard output' # set required paths: source /soft/gromacs/bin/GMXRC export LD_LIBRARY_PATH='/usr/lib64/mpich2/lib:$LD_LIBRARY_PATH' # specify working directory: cd /home/akukol/groTest ### This is the command ### /usr/local/bin/mpiexec mdrun -s md_test.tpr -c after_md.gro -v -stepout 1000 ### command end ### # start with 'qsub runjob.sh' 2c856e437dd24cb0503971f5e0a8727e4831a97b 76 75 2010-06-02T09:19:20Z Akukol 3 /* How to perform a simulation with Gromacs' mdrun: */ wikitext text/x-wiki == '''How to perform a simulation with Gromacs' mdrun:''' == 1) You need prepare the binary simulation start file (tpr-file) either on your local machine or on the headnode of the cluster. If you prepare it on the local machine, make sure you use the same version of Gromacs. 2) Prepare a c-shell script (e.g. runjob.sh) as shown in the example below. 3) Make the script executable: chmod +x runjob.sh 4) Submit the job to the cluster: qsub runjob.sh The script below needs to be adjusted according to your needs, i.e. the user name (#PBS -u), the number of nodes to use (#PBS -l, each nodes has 8 CPU cores), the working directory and the details of the mdrun command. -------------- <pre>#!/bin/sh #PBS -N GromacsTest #PBS -q main #PBS -l nodes=1:ppn=8 #PBS -j oe #PBS -u akukol # runs a job with name GromacsTest on the 'main' cluster # uses 1 node and 8 CPUs (each nodes has 8 CPUs) # merge 'output' and 'standard error' and output both to 'standard output' # set required paths: source /soft/gromacs/bin/GMXRC export LD_LIBRARY_PATH='/usr/lib64/mpich2/lib:$LD_LIBRARY_PATH' # specify working directory: cd /home/akukol/groTest ### This is the command ### /usr/local/bin/mpiexec mdrun -s md_test.tpr -c after_md.gro -v -stepout 1000 ### command end ### # start with 'qsub runjob.sh' 5097f80631b90487df0a4a78ff1cbf0a2098c660 77 76 2010-06-02T09:20:40Z Akukol 3 /* How to perform a simulation with Gromacs' mdrun: */ wikitext text/x-wiki == '''How to perform a simulation with Gromacs' mdrun:''' == 1) You need to prepare the binary simulation start file (tpr-file) either on your local machine or on the headnode of the cluster. If you prepare it on the local machine, make sure you use the same version of Gromacs. 2) Prepare a c-shell script (e.g. runjob.sh) as shown in the example below. 3) Make the script executable: chmod +x runjob.sh 4) Submit the job to the cluster: qsub runjob.sh The script below needs to be adjusted according to your needs, i.e. the user name (#PBS -u), the number of nodes to use (#PBS -l, each node has 8 CPU cores), the working directory and the details of the mdrun command. 
-------------- <pre>#!/bin/sh #PBS -N GromacsTest #PBS -q main #PBS -l nodes=1:ppn=8 #PBS -j oe #PBS -u akukol # runs a job with name GromacsTest on the 'main' cluster # uses 1 node and 8 CPUs (each nodes has 8 CPUs) # merge 'output' and 'standard error' and output both to 'standard output' # set required paths: source /soft/gromacs/bin/GMXRC export LD_LIBRARY_PATH='/usr/lib64/mpich2/lib:$LD_LIBRARY_PATH' # specify working directory: cd /home/akukol/groTest ### This is the command ### /usr/local/bin/mpiexec mdrun -s md_test.tpr -c after_md.gro -v -stepout 1000 ### command end ### # start with 'qsub runjob.sh' ab4e045557e0d2fde183acfba81b19c863eedafd 78 77 2010-06-02T09:21:32Z Akukol 3 gromacs wikitext text/x-wiki == '''How to perform a simulation with Gromacs' mdrun:''' == 1) You need to prepare the binary simulation start file (tpr-file) either on your local machine or on the headnode of the cluster. If you prepare it on the local machine, make sure you use the same version of Gromacs. 2) Prepare a c-shell script (e.g. runjob.sh) as shown in the example below. 3) Make the script executable: chmod +x runjob.sh 4) Submit the job to the cluster: qsub runjob.sh The script below needs to be adjusted according to your needs, i.e. the user name (#PBS -u), the number of nodes to use (#PBS -l, each node has 8 CPU cores), the working directory and the details of the mdrun command. -------------- <pre>#!/bin/sh #PBS -N GromacsTest #PBS -q main #PBS -l nodes=1:ppn=8 #PBS -j oe #PBS -u akukol # runs a job with name GromacsTest on the 'main' cluster # uses 1 node and 8 CPUs (each nodes has 8 CPUs) # merge 'output' and 'standard error' and output both to 'standard output' # set required paths: source /soft/gromacs/bin/GMXRC export LD_LIBRARY_PATH='/usr/lib64/mpich2/lib:$LD_LIBRARY_PATH' # specify working directory: cd /home/akukol/groTest ### This is the command ### /usr/local/bin/mpiexec mdrun -s md_test.tpr -c after_md.gro -v -stepout 1000 ### command end ### # start with 'qsub runjob.sh' 6cd16cdb43ca951d1d94ac04c4918bd04dfa19fa 82 78 2010-06-02T09:38:45Z Akukol 3 /* How to perform a simulation with Gromacs' mdrun: */ wikitext text/x-wiki Gromacs is a software for molecular dynamics simulation. It treats molecules as particles in a classical mechanics forcefield. == '''How to perform a simulation with Gromacs' mdrun:''' == 1) You need to prepare the binary simulation start file (tpr-file) either on your local machine or on the headnode of the cluster. If you prepare it on the local machine, make sure you use the same version of Gromacs. 2) Prepare a c-shell script (e.g. runjob.sh) as shown in the example below. 3) Make the script executable: chmod +x runjob.sh 4) Submit the job to the cluster: qsub runjob.sh The script below needs to be adjusted according to your needs, i.e. the user name (#PBS -u), the number of nodes to use (#PBS -l, each node has 8 CPU cores), the working directory and the details of the mdrun command. 
-------------- <pre>#!/bin/sh #PBS -N GromacsTest #PBS -q main #PBS -l nodes=1:ppn=8 #PBS -j oe #PBS -u akukol # runs a job with name GromacsTest on the 'main' cluster # uses 1 node and 8 CPUs (each nodes has 8 CPUs) # merge 'output' and 'standard error' and output both to 'standard output' # set required paths: source /soft/gromacs/bin/GMXRC export LD_LIBRARY_PATH='/usr/lib64/mpich2/lib:$LD_LIBRARY_PATH' # specify working directory: cd /home/akukol/groTest ### This is the command ### /usr/local/bin/mpiexec mdrun -s md_test.tpr -c after_md.gro -v -stepout 1000 ### command end ### # start with 'qsub runjob.sh' 648e371a84eb3e8678d696f5c8e8f87807042058 88 82 2010-06-17T08:34:36Z Akukol 3 wikitext text/x-wiki Gromacs is a software for molecular dynamics simulation. It treats molecules as particles in a classical mechanics forcefield. == '''How to perform a simulation with Gromacs' mdrun:''' == 1) You need to prepare the binary simulation start file (tpr-file) either on your local machine or on the headnode of the cluster. If you prepare it on the local machine, make sure you use the same version of Gromacs. 2) Prepare a c-shell script (e.g. runjob.sh) as shown in the example below. [[jobs:Here are all the options explained.|Here are all the options explained.]] 3) Make the script executable: chmod +x runjob.sh 4) Submit the job to the cluster: qsub runjob.sh The script below needs to be adjusted according to your needs, i.e. the user name (#PBS -u), the number of nodes to use (#PBS -l, each node has 8 CPU cores), the maximum time a job should run (#PBS -l walltime=<hours>:<min>:<sec>), the working directory and the details of the mdrun command. Note that the default [[Queues:walltime|walltime]] on the 'main' queue is 24 hours. If you don't specify any walltime, the job will stop after 24 hours. -------------- <pre>#!/bin/sh #PBS -N GromacsTest #PBS -q main #PBS -l nodes=1:ppn=8 #PBS -l walltime=02:00:00 #PBS -j oe #PBS -u akukol # runs a job with name 'GromacsTest' on the 'main' cluster # uses 1 node and 8 CPUs (each nodes has 8 CPUs) # set a maximum time of two hours (walltime) # merge 'output' and 'standard error' and output both to 'standard output' (-j oe) # specifies user 'akukol' # set required paths: source /soft/gromacs/bin/GMXRC export LD_LIBRARY_PATH='/usr/lib64/mpich2/lib:$LD_LIBRARY_PATH' # specify working directory: cd /home/akukol/groTest ### This is the command ### /usr/local/bin/mpiexec mdrun -s md_test.tpr -c after_md.gro -v -stepout 1000 ### command end ### # start with 'qsub runjob.sh' 45ed455d237e8daa2165784d2acfd0aa3fa4b37b 89 88 2010-06-17T08:35:20Z Akukol 3 /* How to perform a simulation with Gromacs' mdrun: */ wikitext text/x-wiki Gromacs is a software for molecular dynamics simulation. It treats molecules as particles in a classical mechanics forcefield. == '''How to perform a simulation with Gromacs' mdrun:''' == 1) You need to prepare the binary simulation start file (tpr-file) either on your local machine or on the headnode of the cluster. If you prepare it on the local machine, make sure you use the same version of Gromacs. 2) Prepare a c-shell script (e.g. runjob.sh) as shown in the example below. [[Jobs:Here are all the options explained.|Here are all the options explained.]] 3) Make the script executable: chmod +x runjob.sh 4) Submit the job to the cluster: qsub runjob.sh The script below needs to be adjusted according to your needs, i.e. 
the user name (#PBS -u), the number of nodes to use (#PBS -l, each node has 8 CPU cores), the maximum time a job should run (#PBS -l walltime=<hours>:<min>:<sec>), the working directory and the details of the mdrun command. Note that the default [[Queues:walltime|walltime]] on the 'main' queue is 24 hours. If you don't specify any walltime, the job will stop after 24 hours. -------------- <pre>#!/bin/sh #PBS -N GromacsTest #PBS -q main #PBS -l nodes=1:ppn=8 #PBS -l walltime=02:00:00 #PBS -j oe #PBS -u akukol # runs a job with name 'GromacsTest' on the 'main' cluster # uses 1 node and 8 CPUs (each nodes has 8 CPUs) # set a maximum time of two hours (walltime) # merge 'output' and 'standard error' and output both to 'standard output' (-j oe) # specifies user 'akukol' # set required paths: source /soft/gromacs/bin/GMXRC export LD_LIBRARY_PATH='/usr/lib64/mpich2/lib:$LD_LIBRARY_PATH' # specify working directory: cd /home/akukol/groTest ### This is the command ### /usr/local/bin/mpiexec mdrun -s md_test.tpr -c after_md.gro -v -stepout 1000 ### command end ### # start with 'qsub runjob.sh' d9c62ebc1848c9b3abaa4e6834b06fcd95c5bdbc 90 89 2010-06-17T08:37:37Z Akukol 3 /* How to perform a simulation with Gromacs' mdrun: */ wikitext text/x-wiki Gromacs is a software for molecular dynamics simulation. It treats molecules as particles in a classical mechanics forcefield. == '''How to perform a simulation with Gromacs' mdrun:''' == 1) You need to prepare the binary simulation start file (tpr-file) either on your local machine or on the headnode of the cluster. If you prepare it on the local machine, make sure you use the same version of Gromacs. 2) Prepare a c-shell script (e.g. runjob.sh) as shown in the example below. More explanations: [[Jobs]] 3) Make the script executable: chmod +x runjob.sh 4) Submit the job to the cluster: qsub runjob.sh The script below needs to be adjusted according to your needs, i.e. the user name (#PBS -u), the number of nodes to use (#PBS -l, each node has 8 CPU cores), the maximum time a job should run (#PBS -l walltime=<hours>:<min>:<sec>), the working directory and the details of the mdrun command. Note that the default [[walltime]] on the 'main' queue is 24 hours. If you don't specify any walltime, the job will stop after 24 hours. -------------- <pre>#!/bin/sh #PBS -N GromacsTest #PBS -q main #PBS -l nodes=1:ppn=8 #PBS -l walltime=02:00:00 #PBS -j oe #PBS -u akukol # runs a job with name 'GromacsTest' on the 'main' cluster # uses 1 node and 8 CPUs (each nodes has 8 CPUs) # set a maximum time of two hours (walltime) # merge 'output' and 'standard error' and output both to 'standard output' (-j oe) # specifies user 'akukol' # set required paths: source /soft/gromacs/bin/GMXRC export LD_LIBRARY_PATH='/usr/lib64/mpich2/lib:$LD_LIBRARY_PATH' # specify working directory: cd /home/akukol/groTest ### This is the command ### /usr/local/bin/mpiexec mdrun -s md_test.tpr -c after_md.gro -v -stepout 1000 ### command end ### # start with 'qsub runjob.sh' 2a4e69b1e9052372c0dd001aa4605b7f93e11302 91 90 2010-06-17T08:39:21Z Akukol 3 /* How to perform a simulation with Gromacs' mdrun: */ wikitext text/x-wiki Gromacs is a software for molecular dynamics simulation. It treats molecules as particles in a classical mechanics forcefield. == '''How to perform a simulation with Gromacs' mdrun:''' == 1) You need to prepare the binary simulation start file (tpr-file) either on your local machine or on the headnode of the cluster. 
If you prepare it on the local machine, make sure you use the same version of Gromacs. 2) Prepare a c-shell script (e.g. runjob.sh) as shown in the example below. More explanations: [[Jobs]] 3) Make the script executable: chmod +x runjob.sh 4) Submit the job to the cluster: qsub runjob.sh The script below needs to be adjusted according to your needs, i.e. the user name (#PBS -u), the number of nodes to use (#PBS -l, each node has 8 CPU cores), the maximum time a job should run (#PBS -l walltime=<hours>:<min>:<sec>), the working directory and the details of the mdrun command. Note that the default on the 'main' queue is 24 hours. If you don't specify any walltime, the job will stop after 24 hours. More info about walltimes: [[Queues]] -------------- <pre>#!/bin/sh #PBS -N GromacsTest #PBS -q main #PBS -l nodes=1:ppn=8 #PBS -l walltime=02:00:00 #PBS -j oe #PBS -u akukol # runs a job with name 'GromacsTest' on the 'main' cluster # uses 1 node and 8 CPUs (each nodes has 8 CPUs) # set a maximum time of two hours (walltime) # merge 'output' and 'standard error' and output both to 'standard output' (-j oe) # specifies user 'akukol' # set required paths: source /soft/gromacs/bin/GMXRC export LD_LIBRARY_PATH='/usr/lib64/mpich2/lib:$LD_LIBRARY_PATH' # specify working directory: cd /home/akukol/groTest ### This is the command ### /usr/local/bin/mpiexec mdrun -s md_test.tpr -c after_md.gro -v -stepout 1000 ### command end ### # start with 'qsub runjob.sh' 37f274a3a5ecb7abcb21ba75a22bcf155c05e3fb 92 91 2010-06-17T08:41:36Z Akukol 3 /* How to perform a simulation with Gromacs' mdrun: */ wikitext text/x-wiki Gromacs is a software for molecular dynamics simulation. It treats molecules as particles in a classical mechanics forcefield. == '''How to perform a simulation with Gromacs' mdrun:''' == 1) You need to prepare the binary simulation start file (tpr-file) either on your local machine or on the headnode of the cluster. If you prepare it on the local machine, make sure you use the same version of Gromacs. 2) Prepare a c-shell script (e.g. runjob.sh) as shown in the example below. More explanations: [[Jobs|More explanations.]] 3) Make the script executable: chmod +x runjob.sh 4) Submit the job to the cluster: qsub runjob.sh The script below needs to be adjusted according to your needs, i.e. the user name (#PBS -u), the number of nodes to use (#PBS -l, each node has 8 CPU cores), the maximum time a job should run (#PBS -l walltime=<hours>:<min>:<sec>), the working directory and the details of the mdrun command. Note that the default on the 'main' queue is 24 hours. If you don't specify any walltime, the job will stop after 24 hours. 
More info about walltimes: [[Queues]] -------------- <pre>#!/bin/sh #PBS -N GromacsTest #PBS -q main #PBS -l nodes=1:ppn=8 #PBS -l walltime=02:00:00 #PBS -j oe #PBS -u akukol # runs a job with name 'GromacsTest' on the 'main' cluster # uses 1 node and 8 CPUs (each nodes has 8 CPUs) # set a maximum time of two hours (walltime) # merge 'output' and 'standard error' and output both to 'standard output' (-j oe) # specifies user 'akukol' # set required paths: source /soft/gromacs/bin/GMXRC export LD_LIBRARY_PATH='/usr/lib64/mpich2/lib:$LD_LIBRARY_PATH' # specify working directory: cd /home/akukol/groTest ### This is the command ### /usr/local/bin/mpiexec mdrun -s md_test.tpr -c after_md.gro -v -stepout 1000 ### command end ### # start with 'qsub runjob.sh' a57cd6222dc99204d7cc11d5cbf0bc37b29fcce4 93 92 2010-06-17T08:42:53Z Akukol 3 /* How to perform a simulation with Gromacs' mdrun: */ wikitext text/x-wiki Gromacs is a software for molecular dynamics simulation. It treats molecules as particles in a classical mechanics forcefield. == '''How to perform a simulation with Gromacs' mdrun:''' == 1) You need to prepare the binary simulation start file (tpr-file) either on your local machine or on the headnode of the cluster. If you prepare it on the local machine, make sure you use the same version of Gromacs. 2) Prepare a c-shell script (e.g. runjob.sh) as shown in the example below. [[Jobs|More info.]] 3) Make the script executable: chmod +x runjob.sh 4) Submit the job to the cluster: qsub runjob.sh The script below needs to be adjusted according to your needs, i.e. the user name (#PBS -u), the number of nodes to use (#PBS -l, each node has 8 CPU cores), the maximum time a job should run (#PBS -l walltime=<hours>:<min>:<sec>), the working directory and the details of the mdrun command. Note that the default on the 'main' queue is 24 hours. If you don't specify any [[Queues|walltime]], the job will stop after 24 hours. -------------- <pre>#!/bin/sh #PBS -N GromacsTest #PBS -q main #PBS -l nodes=1:ppn=8 #PBS -l walltime=02:00:00 #PBS -j oe #PBS -u akukol # runs a job with name 'GromacsTest' on the 'main' cluster # uses 1 node and 8 CPUs (each nodes has 8 CPUs) # set a maximum time of two hours (walltime) # merge 'output' and 'standard error' and output both to 'standard output' (-j oe) # specifies user 'akukol' # set required paths: source /soft/gromacs/bin/GMXRC export LD_LIBRARY_PATH='/usr/lib64/mpich2/lib:$LD_LIBRARY_PATH' # specify working directory: cd /home/akukol/groTest ### This is the command ### /usr/local/bin/mpiexec mdrun -s md_test.tpr -c after_md.gro -v -stepout 1000 ### command end ### # start with 'qsub runjob.sh' 4582024133ceefbcfa4c5d94ff64afdb65d8614c 94 93 2010-06-17T08:47:33Z Akukol 3 wikitext text/x-wiki [http://www.gromacs.org/ Gromacs]is a software for molecular dynamics simulation. It treats molecules as particles in a classical mechanics forcefield. == '''How to perform a simulation with Gromacs' mdrun:''' == 1) You need to prepare the binary simulation start file (tpr-file) either on your local machine or on the headnode of the cluster. If you prepare it on the local machine, make sure you use the same version of Gromacs. 2) Prepare a c-shell script (e.g. runjob.sh) as shown in the example below. [[Jobs|More info.]] 3) Make the script executable: chmod +x runjob.sh 4) Submit the job to the cluster: qsub runjob.sh The script below needs to be adjusted according to your needs, i.e. 
the user name (#PBS -u), the number of nodes to use (#PBS -l, each node has 8 CPU cores), the maximum time a job should run (#PBS -l walltime=<hours>:<min>:<sec>), the working directory and the details of the mdrun command. Note that the default on the 'main' queue is 24 hours. If you don't specify any [[Queues|walltime]], the job will stop after 24 hours. -------------- <pre>#!/bin/sh #PBS -N GromacsTest #PBS -q main #PBS -l nodes=1:ppn=8 #PBS -l walltime=02:00:00 #PBS -j oe #PBS -u akukol # runs a job with name 'GromacsTest' on the 'main' cluster # uses 1 node and 8 CPUs (each nodes has 8 CPUs) # set a maximum time of two hours (walltime) # merge 'output' and 'standard error' and output both to 'standard output' (-j oe) # specifies user 'akukol' # set required paths: source /soft/gromacs/bin/GMXRC export LD_LIBRARY_PATH='/usr/lib64/mpich2/lib:$LD_LIBRARY_PATH' # specify working directory: cd /home/akukol/groTest ### This is the command ### /usr/local/bin/mpiexec mdrun -s md_test.tpr -c after_md.gro -v -stepout 1000 ### command end ### # start with 'qsub runjob.sh' ec9848b95954370827db4a358b45c49dfa3a3ab3 95 94 2010-06-17T08:47:44Z Akukol 3 wikitext text/x-wiki [http://www.gromacs.org/ Gromacs] is a software for molecular dynamics simulation. It treats molecules as particles in a classical mechanics forcefield. == '''How to perform a simulation with Gromacs' mdrun:''' == 1) You need to prepare the binary simulation start file (tpr-file) either on your local machine or on the headnode of the cluster. If you prepare it on the local machine, make sure you use the same version of Gromacs. 2) Prepare a c-shell script (e.g. runjob.sh) as shown in the example below. [[Jobs|More info.]] 3) Make the script executable: chmod +x runjob.sh 4) Submit the job to the cluster: qsub runjob.sh The script below needs to be adjusted according to your needs, i.e. the user name (#PBS -u), the number of nodes to use (#PBS -l, each node has 8 CPU cores), the maximum time a job should run (#PBS -l walltime=<hours>:<min>:<sec>), the working directory and the details of the mdrun command. Note that the default on the 'main' queue is 24 hours. If you don't specify any [[Queues|walltime]], the job will stop after 24 hours. -------------- <pre>#!/bin/sh #PBS -N GromacsTest #PBS -q main #PBS -l nodes=1:ppn=8 #PBS -l walltime=02:00:00 #PBS -j oe #PBS -u akukol # runs a job with name 'GromacsTest' on the 'main' cluster # uses 1 node and 8 CPUs (each nodes has 8 CPUs) # set a maximum time of two hours (walltime) # merge 'output' and 'standard error' and output both to 'standard output' (-j oe) # specifies user 'akukol' # set required paths: source /soft/gromacs/bin/GMXRC export LD_LIBRARY_PATH='/usr/lib64/mpich2/lib:$LD_LIBRARY_PATH' # specify working directory: cd /home/akukol/groTest ### This is the command ### /usr/local/bin/mpiexec mdrun -s md_test.tpr -c after_md.gro -v -stepout 1000 ### command end ### # start with 'qsub runjob.sh' e5b3df18473a441ceb762a1e034757d592928b07 96 95 2010-06-17T08:49:40Z Akukol 3 wikitext text/x-wiki [*http://www.gromacs.org/ Gromacs] is a software for molecular dynamics simulation. It treats molecules as particles in a classical mechanics forcefield. == '''How to perform a simulation with Gromacs' mdrun:''' == 1) You need to prepare the binary simulation start file (tpr-file) either on your local machine or on the headnode of the cluster. If you prepare it on the local machine, make sure you use the same version of Gromacs. 2) Prepare a c-shell script (e.g. 
runjob.sh) as shown in the example below. [[Jobs|More info.]] 3) Make the script executable: chmod +x runjob.sh 4) Submit the job to the cluster: qsub runjob.sh The script below needs to be adjusted according to your needs, i.e. the user name (#PBS -u), the number of nodes to use (#PBS -l, each node has 8 CPU cores), the maximum time a job should run (#PBS -l walltime=<hours>:<min>:<sec>), the working directory and the details of the mdrun command. Note that the default on the 'main' queue is 24 hours. If you don't specify any [[Queues|walltime]], the job will stop after 24 hours. -------------- <pre>#!/bin/sh #PBS -N GromacsTest #PBS -q main #PBS -l nodes=1:ppn=8 #PBS -l walltime=02:00:00 #PBS -j oe #PBS -u akukol # runs a job with name 'GromacsTest' on the 'main' cluster # uses 1 node and 8 CPUs (each nodes has 8 CPUs) # set a maximum time of two hours (walltime) # merge 'output' and 'standard error' and output both to 'standard output' (-j oe) # specifies user 'akukol' # set required paths: source /soft/gromacs/bin/GMXRC export LD_LIBRARY_PATH='/usr/lib64/mpich2/lib:$LD_LIBRARY_PATH' # specify working directory: cd /home/akukol/groTest ### This is the command ### /usr/local/bin/mpiexec mdrun -s md_test.tpr -c after_md.gro -v -stepout 1000 ### command end ### # start with 'qsub runjob.sh' a013927385dff87a6fd36361b1d3df3eb978596a 97 96 2010-06-17T08:52:06Z Akukol 3 wikitext text/x-wiki <a href="http://www.gromacs.org" target="_blank">Gromacs</a> is a software for molecular dynamics simulation. It treats molecules as particles in a classical mechanics forcefield. == '''How to perform a simulation with Gromacs' mdrun:''' == 1) You need to prepare the binary simulation start file (tpr-file) either on your local machine or on the headnode of the cluster. If you prepare it on the local machine, make sure you use the same version of Gromacs. 2) Prepare a c-shell script (e.g. runjob.sh) as shown in the example below. [[Jobs|More info.]] 3) Make the script executable: chmod +x runjob.sh 4) Submit the job to the cluster: qsub runjob.sh The script below needs to be adjusted according to your needs, i.e. the user name (#PBS -u), the number of nodes to use (#PBS -l, each node has 8 CPU cores), the maximum time a job should run (#PBS -l walltime=<hours>:<min>:<sec>), the working directory and the details of the mdrun command. Note that the default on the 'main' queue is 24 hours. If you don't specify any [[Queues|walltime]], the job will stop after 24 hours. -------------- <pre>#!/bin/sh #PBS -N GromacsTest #PBS -q main #PBS -l nodes=1:ppn=8 #PBS -l walltime=02:00:00 #PBS -j oe #PBS -u akukol # runs a job with name 'GromacsTest' on the 'main' cluster # uses 1 node and 8 CPUs (each nodes has 8 CPUs) # set a maximum time of two hours (walltime) # merge 'output' and 'standard error' and output both to 'standard output' (-j oe) # specifies user 'akukol' # set required paths: source /soft/gromacs/bin/GMXRC export LD_LIBRARY_PATH='/usr/lib64/mpich2/lib:$LD_LIBRARY_PATH' # specify working directory: cd /home/akukol/groTest ### This is the command ### /usr/local/bin/mpiexec mdrun -s md_test.tpr -c after_md.gro -v -stepout 1000 ### command end ### # start with 'qsub runjob.sh' 34556be0c275cfc87bea737f60b1f15823b2a15d 98 97 2010-06-17T08:53:02Z Akukol 3 wikitext text/x-wiki [[http://www.gromacs.org Gromacs]] is a software for molecular dynamics simulation. It treats molecules as particles in a classical mechanics forcefield. 
== '''How to perform a simulation with Gromacs' mdrun:''' == 1) You need to prepare the binary simulation start file (tpr-file) either on your local machine or on the headnode of the cluster. If you prepare it on the local machine, make sure you use the same version of Gromacs. 2) Prepare a c-shell script (e.g. runjob.sh) as shown in the example below. [[Jobs|More info.]] 3) Make the script executable: chmod +x runjob.sh 4) Submit the job to the cluster: qsub runjob.sh The script below needs to be adjusted according to your needs, i.e. the user name (#PBS -u), the number of nodes to use (#PBS -l, each node has 8 CPU cores), the maximum time a job should run (#PBS -l walltime=<hours>:<min>:<sec>), the working directory and the details of the mdrun command. Note that the default on the 'main' queue is 24 hours. If you don't specify any [[Queues|walltime]], the job will stop after 24 hours. -------------- <pre>#!/bin/sh #PBS -N GromacsTest #PBS -q main #PBS -l nodes=1:ppn=8 #PBS -l walltime=02:00:00 #PBS -j oe #PBS -u akukol # runs a job with name 'GromacsTest' on the 'main' cluster # uses 1 node and 8 CPUs (each nodes has 8 CPUs) # set a maximum time of two hours (walltime) # merge 'output' and 'standard error' and output both to 'standard output' (-j oe) # specifies user 'akukol' # set required paths: source /soft/gromacs/bin/GMXRC export LD_LIBRARY_PATH='/usr/lib64/mpich2/lib:$LD_LIBRARY_PATH' # specify working directory: cd /home/akukol/groTest ### This is the command ### /usr/local/bin/mpiexec mdrun -s md_test.tpr -c after_md.gro -v -stepout 1000 ### command end ### # start with 'qsub runjob.sh' 555168f1e46353e4d9b19cb9ea7e9cba13e9dba6 99 98 2010-06-17T08:53:13Z Akukol 3 wikitext text/x-wiki [http://www.gromacs.org Gromacs] is a software for molecular dynamics simulation. It treats molecules as particles in a classical mechanics forcefield. == '''How to perform a simulation with Gromacs' mdrun:''' == 1) You need to prepare the binary simulation start file (tpr-file) either on your local machine or on the headnode of the cluster. If you prepare it on the local machine, make sure you use the same version of Gromacs. 2) Prepare a c-shell script (e.g. runjob.sh) as shown in the example below. [[Jobs|More info.]] 3) Make the script executable: chmod +x runjob.sh 4) Submit the job to the cluster: qsub runjob.sh The script below needs to be adjusted according to your needs, i.e. the user name (#PBS -u), the number of nodes to use (#PBS -l, each node has 8 CPU cores), the maximum time a job should run (#PBS -l walltime=<hours>:<min>:<sec>), the working directory and the details of the mdrun command. Note that the default on the 'main' queue is 24 hours. If you don't specify any [[Queues|walltime]], the job will stop after 24 hours. 
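For step 1, the tpr file is normally produced with Gromacs' grompp tool. The fragment below is only a sketch: the .mdp, .gro and .top file names are placeholders, and only the output name matches the md_test.tpr used in the script that follows.
<pre>
# run on the headnode (or locally with the same Gromacs version)
source /soft/gromacs/bin/GMXRC
grompp -f md.mdp -c conf.gro -p topol.top -o md_test.tpr
</pre>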
-------------- <pre>#!/bin/sh #PBS -N GromacsTest #PBS -q main #PBS -l nodes=1:ppn=8 #PBS -l walltime=02:00:00 #PBS -j oe #PBS -u akukol # runs a job with name 'GromacsTest' on the 'main' cluster # uses 1 node and 8 CPUs (each nodes has 8 CPUs) # set a maximum time of two hours (walltime) # merge 'output' and 'standard error' and output both to 'standard output' (-j oe) # specifies user 'akukol' # set required paths: source /soft/gromacs/bin/GMXRC export LD_LIBRARY_PATH='/usr/lib64/mpich2/lib:$LD_LIBRARY_PATH' # specify working directory: cd /home/akukol/groTest ### This is the command ### /usr/local/bin/mpiexec mdrun -s md_test.tpr -c after_md.gro -v -stepout 1000 ### command end ### # start with 'qsub runjob.sh' cad460a95ec1b44a746cefc4c690a2f80eecad25 100 99 2010-06-17T09:24:15Z Akukol 3 wikitext text/x-wiki [http://www.gromacs.org Gromacs] is a software for molecular dynamics simulation. It treats molecules as particles in a classical mechanics forcefield. == '''How to perform a simulation with Gromacs' mdrun:''' == 1) You need to prepare the binary simulation start file (tpr-file) either on your local machine or on the headnode of the cluster. If you prepare it on the local machine, make sure you use the same version of Gromacs. 2) Prepare a c-shell script (e.g. runjob.sh) as shown in the example below. [[Jobs|More info.]] 3) Make the script executable: chmod +x runjob.sh 4) Submit the job to the cluster: qsub runjob.sh The script below needs to be adjusted according to your needs, i.e. the user name (#PBS -u), the number of nodes to use (#PBS -l, each node has 8 CPU cores), the maximum time a job should run (#PBS -l walltime=<hours>:<min>:<sec>), the working directory and the details of the mdrun command. Note that the default on the 'main' queue is 24 hours. If you don't specify any [[Queues|walltime]], the job will stop after 24 hours. Look here for [[groperform|optimising performance]]. -------------- <pre>#!/bin/sh #PBS -N GromacsTest #PBS -q main #PBS -l nodes=1:ppn=8 #PBS -l walltime=02:00:00 #PBS -j oe #PBS -u akukol # runs a job with name 'GromacsTest' on the 'main' cluster # uses 1 node and 8 CPUs (each nodes has 8 CPUs) # set a maximum time of two hours (walltime) # merge 'output' and 'standard error' and output both to 'standard output' (-j oe) # specifies user 'akukol' # set required paths: source /soft/gromacs/bin/GMXRC export LD_LIBRARY_PATH='/usr/lib64/mpich2/lib:$LD_LIBRARY_PATH' # specify working directory: cd /home/akukol/groTest ### This is the command ### /usr/local/bin/mpiexec mdrun -s md_test.tpr -c after_md.gro -v -stepout 1000 ### command end ### # start with 'qsub runjob.sh' a83332b470d0d51f119cd67a408d30d106530ced Queues 0 15 85 50 2010-06-17T07:06:59Z Mjh 2 wikitext text/x-wiki There are four possible job queues available on the system: * 'main' is the default queue: this submits to the 48 nodes of the main cluster. There are no other limitations on this queue. * 'cair_s' submits to the 32 CAIR nodes, with a maximum wall time of 6 hours. * 'cair_l' also submits to the CAIR nodes with no maximum wall time. This queue is restricted to CAIR users. * Finally 'all' submits to all 80 nodes, with a maximum wall time of 2 hours; note that MPI over Infiniband will not work if you request nodes spanning both the main and CAIR clusters. == Default wall times == The default wall time limitation for the 'all' and 'cair_s' queues is also the maximum, i.e. 6 hours and 2 hours respectively. The default wall time for all the other queues is 24 hours. 
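For example, a longer limit can be requested either in the job script or on the qsub command line; this is only a sketch and the 48-hour figure is illustrative.
<pre>
# in the job script:
#PBS -l walltime=48:00:00
# or at submission time:
qsub -l walltime=48:00:00 runjob.sh
</pre>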
If you want a job to run for longer, you must explicitly provide a wall time estimate. 718dcd07c4345d1e8e241ad27decd2e560e0349e 87 85 2010-06-17T07:19:45Z Mjh 2 wikitext text/x-wiki There are four possible job queues available on the system: * 'main' is the default queue: this submits to the 48 nodes of the main cluster. There are no other limitations on this queue. * 'cair_s' submits to the 32 CAIR nodes, with a maximum wall time of 6 hours. * 'cair_l' also submits to the CAIR nodes with no maximum wall time. This queue is restricted to CAIR users. * Finally 'all' submits to all 80 nodes, with a maximum wall time of 2 hours; note that MPI over Infiniband will not work if you request nodes spanning both the main and CAIR clusters. This queue should only be used in unusual circumstances -- normally it will be better to use one of the others. == Default wall times == The default wall time limitation for the 'all' and 'cair_s' queues is also the maximum, i.e. 2 hours and 6 hours respectively. The default wall time for all the other queues is 24 hours. If you want a job to run for longer, you must explicitly provide a wall time estimate. 4cca22adbe2ec2a3d819b36044cbc5fe396bee3b Groperform 0 20 101 2010-06-17T09:45:18Z Akukol 3 Created page with '== '''How to optimise the performance of Gromacs on the cluster''' == 1) Perform a short simulation (100 ps) using the same number of nodes as you would for the long simulation.…' wikitext text/x-wiki == '''How to optimise the performance of Gromacs on the cluster''' == 1) Perform a short simulation (100 ps) using the same number of nodes as you would for the long simulation. 2) Analyse the output of mdrun: For example ----- Will use 10 particle-particle and 6 PME only nodes This is a guess, check the performance at the end of the log file .. .. Average load imbalance: 17.3 % Part of the total run time spent waiting due to load imbalance: 8.3 % Steps where the load balancing was limited by -rdd, -rcon and/or -dds: X 0 % Y 9 % Average PME mesh/force load: 1.370 Part of the total run time spent waiting due to PP/PME imbalance: 14.6 % ----- 3) If you use more than 12 cores, you should optimise the number of PME only nodes/cores, by using the -npme option of mdrun. For the example above try '-npme 8'. The number of PME only nodes cannot be larger than half the total number of nodes. 4) If the energy file is not required for further analysis, the option -nosum can be used. Note that all the above options are mainly relevant when using more than one node (more than 8 cores), in order to optimise the communication between nodes. 9bf13da0acde56145ebbf5d5de4ff53205b329b1 102 101 2010-06-17T09:48:33Z Akukol 3 /* How to optimise the performance of Gromacs on the cluster */ wikitext text/x-wiki == '''How to optimise the performance of Gromacs on the cluster''' == 1) Perform a short simulation (100 ps) using the same number of nodes as you would for the long simulation. 2) Analyse the output of mdrun: For example <pre> Will use 10 particle-particle and 6 PME only nodes This is a guess, check the performance at the end of the log file ... ... Average load imbalance: 17.3 % Part of the total run time spent waiting due to load imbalance: 8.3 % Steps where the load balancing was limited by -rdd, -rcon and/or -dds: X 0 % Y 9 % Average PME mesh/force load: 1.370 Part of the total run time spent waiting due to PP/PME imbalance: 14.6 % </pre> 3) If you use more than 12 cores, you should optimise the number of PME only nodes/cores, by using the -npme option of mdrun.
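A sketch of such a run, reusing the file names from the runjob.sh example on the [[Gromacs]] page (the -npme value is whatever the log output suggests for your system):
<pre>
/usr/local/bin/mpiexec mdrun -s md_test.tpr -npme 8 -c after_md.gro -v -stepout 1000
</pre>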
For the example above try '-npme 8'. The number of PME only nodes cannot be larger than half the total number of nodes. 4) If the energy file is not required for further analysis, the option -nosum can be used. Note that all above options are mainly relevant, when using more than one node (more than 8 cores) in order to optimise the comunication between nodes. 291bd607ffa95ff0b2ec1f5995968ab0656b4f48 103 102 2010-06-17T09:50:17Z Akukol 3 /* How to optimise the performance of Gromacs on the cluster */ wikitext text/x-wiki == '''How to optimise the performance of Gromacs on the cluster''' == 1) Perform a short simulation (100 ps) using the same number of nodes as you would for the long simulation. 2) Analyse the output of mdrun: For example <pre> Will use 10 particle-particle and 6 PME only nodes This is a guess, check the performance at the end of the log file ... ... Average load imbalance: 17.3 % Part of the total run time spent waiting due to load imbalance: 8.3 % Steps where the load balancing was limited by -rdd, -rcon and/or -dds: X 0 % Y 9 % Average PME mesh/force load: 1.370 Part of the total run time spent waiting due to PP/PME imbalance: 14.6 % </pre> 3) If you use more than 12 cores, you should optimise the number of PME only nodes/cores, by using the -npme option of mdrun. For the example above try '-npme 8'. The number of PME only nodes cannot be larger than half the total number of nodes. 4) If the energy file is not required for further analysis, the option -nosum can be used. Note that all above options are mainly relevant, when using more than one node (more than 8 cores) in order to optimise the comunication between nodes. 8f430c9fbb311dfec9cfec55ff578293d3ba32cb 104 103 2010-06-17T09:51:13Z Akukol 3 wikitext text/x-wiki == '''How to optimise the performance of Gromacs on the cluster''' == 1) Perform a short simulation (100 ps) using the same number of nodes as you would for the long simulation. 2) Analyse the output of mdrun: For example <pre> Will use 10 particle-particle and 6 PME only nodes This is a guess, check the performance at the end of the log file ... ... Average load imbalance: 17.3 % Part of the total run time spent waiting due to load imbalance: 8.3 % Steps where the load balancing was limited by -rdd, -rcon and/or -dds: X 0 % Y 9 % Average PME mesh/force load: 1.370 Part of the total run time spent waiting due to PP/PME imbalance: 14.6 % </pre> 3) If you use more than 12 cores, you should optimise the number of PME only nodes/cores, by using the -npme option of mdrun. For the example above try '-npme 8'. The number of PME only cores cannot be larger than half the total number of cores. 4) If the energy file is not required for further analysis, the option -nosum can be used. Note that all above options are mainly relevant, when using more than one node (more than 8 cores) in order to optimise the comunication between nodes. 4ddbfacf70de9af0bd10be688e4c154e268df62e Gromacs 0 19 105 100 2010-06-17T10:07:12Z Akukol 3 /* How to perform a simulation with Gromacs' mdrun: */ wikitext text/x-wiki [http://www.gromacs.org Gromacs] is a software for molecular dynamics simulation. It treats molecules as particles in a classical mechanics forcefield. == '''How to perform a simulation with Gromacs' mdrun:''' == 1) You need to prepare the binary simulation start file (tpr-file) either on your local machine or on the headnode of the cluster. If you prepare it on the local machine, make sure you use the same version of Gromacs. 2) Prepare a c-shell script (e.g. 
runjob.sh) as shown in the example below. [[Jobs|More info.]] 3) Make the script executable: chmod +x runjob.sh 4) Submit the job to the cluster: qsub runjob.sh The script below needs to be adjusted according to your needs, i.e. the user name (#PBS -u), the number of nodes to use (#PBS -l, each node has 8 CPU cores), the maximum time a job should run (#PBS -l walltime=<hours>:<min>:<sec>), the working directory and the details of the mdrun command. Note that the default on the 'main' queue is 24 hours. If you don't specify any [[Queues|walltime]], the job will stop after 24 hours. Look here for [[groperform|optimising performance]]. -------------- <pre>#!/bin/sh #PBS -N GromacsTest #PBS -q main #PBS -l nodes=1:ppn=8 #PBS -l walltime=02:00:00 #PBS -j oe #PBS -u akukol # runs a job with name 'GromacsTest' on the 'main' cluster # uses 1 node and 8 CPUs (each nodes has 8 CPUs) # set a maximum time of two hours (walltime) # merge 'output' and 'standard error' and output both to 'standard output' (-j oe) # specifies user 'akukol' # set required paths: source /soft/gromacs/bin/GMXRC export LD_LIBRARY_PATH='/usr/lib64/mpich2/lib:$LD_LIBRARY_PATH' # specify working directory: cd /home/akukol/groTest ### This is the command ### /usr/local/bin/mpiexec mdrun -s md_test.tpr -c after_md.gro -v -stepout 1000 ### command end ### # start with 'qsub runjob.sh' </pre> [http://www.phy.bme.hu/~cluster/docs/PBS.html A good tutorial for using PBS] 7250620a34cc962a37fc0b1548ca3bacb299df34 Accounts 0 3 106 29 2010-06-17T11:37:47Z Mjh 2 wikitext text/x-wiki To get an account, speak to John Atkinson in E117C. Accounts are available to the following classes of people: * Members of the Centre for Astrophysics Research (CAR) * Members of the Centre for Atmospheric & Instrumentation Research (CAIR) * Other research-active members of the School of Physics, Astronomy and Mathematics (PAM) * Members of the School of Computer Science (CS) * Others, by special arrangement; restricted to those who have made a financial contribution to the cluster. Access is granted subject to observance of our usage [[policies]]. c53f75292ee04ffd37baaf6903af37b43f7f56ff Software 0 17 107 81 2010-06-18T09:24:39Z Akukol 3 wikitext text/x-wiki This page documents locations of software. Detailed local documentation of software should go in a page specific to that software. * <u>[[Gromacs]]</u>: 4.0.7 installed in <tt>/soft/gromacs</tt> * Autodock Vina: 1.1.1 installed in <tt>/soft/autodock_vina_1_1_1_linux_x86</tt> * <u>[[Autodock]]</u>: 4.2 installed in <tt>/soft/autodock</tt> * <u>[[iGemDock]]</u>: 2.1 installed in <tt>/soft/iGEMDOCKv2.1-centos/</tt> 3806f00214c69ae60ba0d54caa964f377c7a600a 113 107 2010-06-18T09:45:01Z Akukol 3 wikitext text/x-wiki This page documents locations of software. Detailed local documentation of software should go in a page specific to that software. * <u>[[Gromacs]]</u>: 4.0.7 installed in <tt>/soft/gromacs</tt> * Autodock [Vina]: 1.1.1 installed in <tt>/soft/autodock_vina_1_1_1_linux_x86</tt> * <u>[[Autodock]]</u>: 4.2 installed in <tt>/soft/autodock</tt> * <u>[[iGemDock]]</u>: 2.1 installed in <tt>/soft/iGEMDOCKv2.1-centos/</tt> ff0b9f37484eb9c6f4e853931b59a9a38a0d2313 114 113 2010-06-18T09:45:30Z Akukol 3 wikitext text/x-wiki This page documents locations of software. Detailed local documentation of software should go in a page specific to that software. 
* <u>[[Gromacs]]</u>: 4.0.7 installed in <tt>/soft/gromacs</tt> * Autodock <u>[[Vina]]</u>: 1.1.1 installed in <tt>/soft/autodock_vina_1_1_1_linux_x86</tt> * <u>[[Autodock]]</u>: 4.2 installed in <tt>/soft/autodock</tt> * <u>[[iGemDock]]</u>: 2.1 installed in <tt>/soft/iGEMDOCKv2.1-centos/</tt> 2b74f156f1a0fe4d84db916950c3c7c27a2605ae IGemDock 0 21 108 2010-06-18T09:28:12Z Akukol 3 Created page with 'IGemDock is a user interface to [http://gemdock.life.nctu.edu.tw/dock/ Gemdock] for molecular docking. It is particularly suited for virtual screening using many processors.' wikitext text/x-wiki IGemDock is a user interface to [http://gemdock.life.nctu.edu.tw/dock/ Gemdock] for molecular docking. It is particularly suited for virtual screening using many processors. bcc4992fadfd990d5aa12aaebaecef93f0757073 109 108 2010-06-18T09:30:02Z Akukol 3 wikitext text/x-wiki IGemDock is a user interface to [http://gemdock.life.nctu.edu.tw/dock/ Gemdock] for molecular docking. It is particularly suited for virtual screening using many processors. Start with /soft/iGEMDOCKv2.1-centos/bin/iGemdock The molecular docking engine is /soft/iGEMDOCKv2.1-centos/bin/mod_ga 1dfde735fc4dcc33c3ef023a825a5d9abc9b0d89 117 109 2010-06-18T10:16:50Z Akukol 3 wikitext text/x-wiki IGemDock is a user interface to [http://gemdock.life.nctu.edu.tw/dock/ Gemdock] for molecular docking. It is particularly suited for virtual screening using many processors. First you need to execute 'export LD_LIBRARY_PATH=/soft/iGEMDOCKv2.1-centos/bin:$LD_LIBRARY_PATH' (put it in your .bashrc, so it is set automatically when you log in) Start with '/soft/iGEMDOCKv2.1-centos/bin/iGemdock' The molecular docking engine is '/soft/iGEMDOCKv2.1-centos/bin/mod_ga' 0ac48ab6fa1213b3e35c117c927c7d4248742aea 118 117 2010-06-23T15:56:02Z Akukol 3 wikitext text/x-wiki IGemDock is a user interface to [http://gemdock.life.nctu.edu.tw/dock/ Gemdock] for molecular docking. First you need to execute 'export LD_LIBRARY_PATH=/soft/iGEMDOCKv2.1-centos/bin:$LD_LIBRARY_PATH' (put it in your .bashrc, so it is set automatically when you log in) Start with '/soft/iGEMDOCKv2.1-centos/bin/iGemdock' The molecular docking engine is '/soft/iGEMDOCKv2.1-centos/bin/mod_ga' Gemdock runs one process only (on one CPU core). d22a875683384edea9bbb2130a0dac8a626e9490 119 118 2010-06-23T15:56:29Z Akukol 3 wikitext text/x-wiki IGemDock is a user interface to [http://gemdock.life.nctu.edu.tw/dock/ Gemdock] for molecular docking. First you need to execute 'export LD_LIBRARY_PATH=/soft/iGEMDOCKv2.1-centos/bin:$LD_LIBRARY_PATH' (put it in your .cshrc, so it is set automatically when you log in) Start with '/soft/iGEMDOCKv2.1-centos/bin/iGemdock' The molecular docking engine is '/soft/iGEMDOCKv2.1-centos/bin/mod_ga' Gemdock runs one process only (on one CPU core). cd637363f6323e7432e85196427a719bc5fbf531 130 119 2010-07-02T13:50:26Z Akukol 3 wikitext text/x-wiki IGemDock is a user interface to [http://gemdock.life.nctu.edu.tw/dock/ Gemdock] for molecular docking. First you need to execute 'export LD_LIBRARY_PATH=/soft/iGEMDOCKv2.1-centos/bin:$LD_LIBRARY_PATH' (put it in your .cshrc, so it is set automatically when you log in) Start with '/soft/iGEMDOCKv2.1-centos/bin/iGemdock' The molecular docking engine is '/soft/iGEMDOCKv2.1-centos/bin/mod_ga' Gemdock runs one process only (on one CPU core).
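One caveat about the job script listed below: its export line is missing the equals sign, so in a #!/bin/sh script it should take the assignment form already quoted above, i.e.
<pre>
export LD_LIBRARY_PATH=/soft/iGEMDOCKv2.1-centos/bin:$LD_LIBRARY_PATH
</pre>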
<pre>#!/bin/sh #PBS -N GemD_comt2 #PBS -q main #PBS -l nodes=1:ppn=1 #PBS -j oe #PBS -u akukol #PBS -l walltime=250:00:00 export LD_LIBRARY_PATH /soft/iGEMDOCKv2.1-centos/bin:$LD_LIBRARY_PATH cd /home/akukol/data/vscreenTest/comt2_gemdock ### This is the command ### /usr/local/bin/mpiexec /soft/iGEMDOCKv2.1-centos/bin/mod_ga -f docking.dock ### command end ### # start with 'qsub RunGemdock.sh' <\pre> 2ef81b03615686a4afd930fedd628eca6511f7f7 131 130 2010-07-02T13:50:38Z Akukol 3 wikitext text/x-wiki IGemDock is a user interface to [http://gemdock.life.nctu.edu.tw/dock/ Gemdock] for molecular docking. First you need to execute 'export LD_LIBRARY_PATH=/soft/iGEMDOCKv2.1-centos/bin:$LD_LIBRARY_PATH' (put it in your .cshrc, so it is set automatically when you log in) Start with '/soft/iGEMDOCKv2.1-centos/bin/iGemdock' The molecular docking engine is '/soft/iGEMDOCKv2.1-centos/bin/mod_ga' Gemdock runs one process only (on one CPU core). <pre>#!/bin/sh #PBS -N GemD_comt2 #PBS -q main #PBS -l nodes=1:ppn=1 #PBS -j oe #PBS -u akukol #PBS -l walltime=250:00:00 export LD_LIBRARY_PATH /soft/iGEMDOCKv2.1-centos/bin:$LD_LIBRARY_PATH cd /home/akukol/data/vscreenTest/comt2_gemdock ### This is the command ### /usr/local/bin/mpiexec /soft/iGEMDOCKv2.1-centos/bin/mod_ga -f docking.dock ### command end ### # start with 'qsub RunGemdock.sh' </pre> a96a7276a8bcfeb7ded719a1f02144b5b898303e 132 131 2010-07-02T13:52:02Z Akukol 3 wikitext text/x-wiki IGemDock is a user interface to [http://gemdock.life.nctu.edu.tw/dock/ Gemdock] for molecular docking. First you need to execute 'export LD_LIBRARY_PATH=/soft/iGEMDOCKv2.1-centos/bin:$LD_LIBRARY_PATH' (put it in your .cshrc, so it is set automatically when you log in) Start with '/soft/iGEMDOCKv2.1-centos/bin/iGemdock' The molecular docking engine is '/soft/iGEMDOCKv2.1-centos/bin/mod_ga' Gemdock runs one process only (on one CPU core). That is the script RunGemdock.sh you need (remember to make RanGemdock.sh executable): <pre>#!/bin/sh #PBS -N GemD_comt2 #PBS -q main #PBS -l nodes=1:ppn=1 #PBS -j oe #PBS -u akukol #PBS -l walltime=250:00:00 export LD_LIBRARY_PATH /soft/iGEMDOCKv2.1-centos/bin:$LD_LIBRARY_PATH cd /home/akukol/data/vscreenTest/comt2_gemdock ### This is the command ### /usr/local/bin/mpiexec /soft/iGEMDOCKv2.1-centos/bin/mod_ga -f docking.dock ### command end ### # start with 'qsub RunGemdock.sh' </pre> b7a6f42451794469ae2cb23ba799c6201d6db426 133 132 2010-07-02T13:52:30Z Akukol 3 wikitext text/x-wiki IGemDock is a user interface to [http://gemdock.life.nctu.edu.tw/dock/ Gemdock] for molecular docking. First you need to execute 'export LD_LIBRARY_PATH=/soft/iGEMDOCKv2.1-centos/bin:$LD_LIBRARY_PATH' (put it in your .cshrc, so it is set automatically when you log in) Start with '/soft/iGEMDOCKv2.1-centos/bin/iGemdock' The molecular docking engine is '/soft/iGEMDOCKv2.1-centos/bin/mod_ga' Gemdock runs one process only (on one CPU core). 
That is the script RunGemdock.sh you need (remember to make RunGemdock.sh executable): <pre>#!/bin/sh #PBS -N GemD_comt2 #PBS -q main #PBS -l nodes=1:ppn=1 #PBS -j oe #PBS -u akukol #PBS -l walltime=250:00:00 export LD_LIBRARY_PATH /soft/iGEMDOCKv2.1-centos/bin:$LD_LIBRARY_PATH cd /home/akukol/data/vscreenTest/comt2_gemdock ### This is the command ### /usr/local/bin/mpiexec /soft/iGEMDOCKv2.1-centos/bin/mod_ga -f docking.dock ### command end ### # start with 'qsub RunGemdock.sh' </pre> f03765c2a432518d8fdd4f5cb4488470298f0ffb Autodock 0 22 110 2010-06-18T09:41:08Z Akukol 3 Created page with '[http://autodock.scripps.edu/ AutoDock] is software for molecular docking. It is best used together with the graphical user interface [http://mgltools.scripps.edu/downloads Auto…' wikitext text/x-wiki [http://autodock.scripps.edu/ AutoDock] is software for molecular docking. It is best used together with the graphical user interface [http://mgltools.scripps.edu/downloads AutoDock Tools]. AutoDock Tools can be installed in your home directory e.g. akukol/bin/MGLtools. You need to download the file mgltools_x86_64Linux2_1.5.4.tar.gz. The automatic installer does not work. 82021f4b4b0d4c48f0fde9298677d72320299196 111 110 2010-06-18T09:41:44Z Akukol 3 wikitext text/x-wiki [http://autodock.scripps.edu/ AutoDock] is software for molecular docking. It is best used together with the graphical user interface [http://mgltools.scripps.edu/downloads AutoDock Tools]. AutoDock Tools can be installed in your home directory e.g. 'akukol/bin/MGLtools'. You need to download the file 'mgltools_x86_64Linux2_1.5.4.tar.gz'. The automatic installer does not work. 03ad99376c68e062700dd0023d6964603e8d2de1 112 111 2010-06-18T09:44:13Z Akukol 3 wikitext text/x-wiki [http://autodock.scripps.edu/ AutoDock] is software for molecular docking. It is best used together with the graphical user interface [http://mgltools.scripps.edu/downloads AutoDock Tools]. AutoDock Tools can be installed in your home directory e.g. 'akukol/bin/MGLtools'. You need to download the file 'mgltools_x86_64Linux2_1.5.4.tar.gz'. The automatic installer does not work. For virtual screening obtain the software [http://autodock.scripps.edu/resources/raccoon Raccoon]. This needs AutoDock tools to be installed first. 1b7b7bf69703566df7feaea2f9f7f1427a87d3f3 120 112 2010-06-23T16:07:27Z Akukol 3 wikitext text/x-wiki [http://autodock.scripps.edu/ AutoDock] is software for molecular docking. It is best used together with the graphical user interface [http://mgltools.scripps.edu/downloads AutoDock Tools]. AutoDock Tools can be installed in your home directory e.g. 'akukol/bin/MGLtools'. You need to download the file 'mgltools_x86_64Linux2_1.5.4.tar.gz'. The automatic installer does not work. For virtual screening obtain the software [http://autodock.scripps.edu/resources/raccoon Raccoon]. This needs AutoDock tools to be installed first. Raccoon automatically generates the file vs_submit.sh. This must be edited according to your special circumstances (see below) and then started with: 'nohup vs_submit.sh &' (do not use qsub) <pre>#!/bin/bash # # Generated with Raccoon | AutoDockVS # #### PBS jobs parametersCPUT="00:20:00" WALLT="00:20:00" # << change here # # There should be no reason # for changing the following values NODES=1 PPN=1 MEM=512mb ### CUSTOM VARIABLES # # use the following line to set special options (e.g. 
specific queues) #OPT="-q MyPriorQueue" OPT="" # Paths for executables on the cluster # Modify them to specify custom executables to be used QSUB="qsub" # << change here AUTODOCK="/soft/autodock/autodock4" # << change here # Special path to move into before running # the screening. This is very system-specific, # so unless you're know what are you doing, # leave it as it is WORKING_PATH=`pwd` ################################################################################## ################################################################################## ####### There should be no need to modify anything below this line ############################### ################################################################################## ################################################################################## # # type $AUTODOCK &> /dev/null || { echo -e "\nError: the file [$AUTODOCK] doesn't exist or is not executable\n"; echo -e "Try to specify the full path to the executable of the AutoDock binary in the script"; echo -e "( i.e. AUTODOCK=/usr/bin/autodock4 )\n\n"; echo -e " [ virtuals screening submission aborted]\n" exit 1; } type $QSUB &> /dev/null || { echo -e "\nError: the file [$QSUB] doesn't exist or is not executable\n"; echo -e "Try to specify the full path to the executable of the Qsub command binary in the script"; echo -e "( i.e. QSUB=/usr/bin/qsub )\n\n"; echo -e " [ virtuals screening submission aborted]\n" exit 1; } echo Starting submission... for NAME in `cat jobs_list` do cd $NAME echo "#!/bin/bash" > $NAME.job echo "cd $WORKING_PATH/$NAME" >> $NAME.job echo "$AUTODOCK -p $NAME.dpf -l $NAME.dlg" >> $NAME.job chmod +x $NAME.job echo -n "Submitting $NAME : " $QSUB $OPT -l cput=$CPUT -l nodes=1:ppn=1 -l walltime=$WALLT -l mem=$MEM $NAME.job sleep 23 # << add this line to avoid flooding the cluster with 1000nds of jobs cd .. done <\pre> The wait time of 23 seconds may be reduced in order to speed up the calculation. 08b1911982264aee959cac6b677dd4b28df6b602 121 120 2010-06-23T16:08:21Z Akukol 3 wikitext text/x-wiki [http://autodock.scripps.edu/ AutoDock] is software for molecular docking. It is best used together with the graphical user interface [http://mgltools.scripps.edu/downloads AutoDock Tools]. AutoDock Tools can be installed in your home directory e.g. 'akukol/bin/MGLtools'. You need to download the file 'mgltools_x86_64Linux2_1.5.4.tar.gz'. The automatic installer does not work. For virtual screening obtain the software [http://autodock.scripps.edu/resources/raccoon Raccoon]. This needs AutoDock tools to be installed first. Raccoon automatically generates the file vs_submit.sh. This must be edited according to your special circumstances (see below) and then started with: 'nohup vs_submit.sh &' (do not use qsub) <pre>#!/bin/bash # # Generated with Raccoon | AutoDockVS # #### PBS jobs parametersCPUT="00:20:00" WALLT="00:20:00" # << change here # # There should be no reason # for changing the following values NODES=1 PPN=1 MEM=512mb ### CUSTOM VARIABLES # # use the following line to set special options (e.g. specific queues) #OPT="-q MyPriorQueue" OPT="" # Paths for executables on the cluster # Modify them to specify custom executables to be used QSUB="qsub" # << change here AUTODOCK="/soft/autodock/autodock4" # << change here # Special path to move into before running # the screening. 
This is very system-specific, # so unless you're know what are you doing, # leave it as it is WORKING_PATH=`pwd` ################################################################################## ################################################################################## ####### There should be no need to modify anything below this line ############################### ################################################################################## ################################################################################## # # type $AUTODOCK &> /dev/null || { echo -e "\nError: the file [$AUTODOCK] doesn't exist or is not executable\n"; echo -e "Try to specify the full path to the executable of the AutoDock binary in the script"; echo -e "( i.e. AUTODOCK=/usr/bin/autodock4 )\n\n"; echo -e " [ virtuals screening submission aborted]\n" exit 1; } type $QSUB &> /dev/null || { echo -e "\nError: the file [$QSUB] doesn't exist or is not executable\n"; echo -e "Try to specify the full path to the executable of the Qsub command binary in the script"; echo -e "( i.e. QSUB=/usr/bin/qsub )\n\n"; echo -e " [ virtuals screening submission aborted]\n" exit 1; } echo Starting submission... for NAME in `cat jobs_list` do cd $NAME echo "#!/bin/bash" > $NAME.job echo "cd $WORKING_PATH/$NAME" >> $NAME.job echo "$AUTODOCK -p $NAME.dpf -l $NAME.dlg" >> $NAME.job chmod +x $NAME.job echo -n "Submitting $NAME : " $QSUB $OPT -l cput=$CPUT -l nodes=1:ppn=1 -l walltime=$WALLT -l mem=$MEM $NAME.job sleep 23 # << add this line to avoid flooding the cluster with 1000nds of jobs cd .. done </pre> The wait time of 23 seconds may be reduced in order to speed up the calculation. 1ba75e8cea04be72dcac1738bbde9e429d826c19 148 121 2010-07-22T14:22:15Z Akukol 3 wikitext text/x-wiki [http://autodock.scripps.edu/ AutoDock] is software for molecular docking. It is best used together with the graphical user interface [http://mgltools.scripps.edu/downloads AutoDock Tools]. AutoDock Tools can be installed in your home directory e.g. 'akukol/bin/MGLtools'. You need to download the file 'mgltools_x86_64Linux2_1.5.4.tar.gz'. The automatic installer does not work. For virtual screening obtain the software [http://autodock.scripps.edu/resources/raccoon Raccoon]. This needs AutoDock tools to be installed first. Raccoon automatically generates the file vs_submit.sh. This must be edited according to your special circumstances (see below) and then started with: 'nohup vs_submit.sh &' (do not use qsub) <pre>#!/bin/bash # # Generated with Raccoon | AutoDockVS # #### PBS jobs parametersCPUT="00:20:00" WALLT="00:20:00" # << change here # # There should be no reason # for changing the following values NODES=1 PPN=1 MEM=512mb ### CUSTOM VARIABLES # # use the following line to set special options (e.g. specific queues) #OPT="-q MyPriorQueue" OPT="-j oe" # join output and error # Paths for executables on the cluster # Modify them to specify custom executables to be used QSUB="qsub" # << change here AUTODOCK="/soft/autodock/autodock4" # << change here # Special path to move into before running # the screening. 
This is very system-specific, # so unless you're know what are you doing, # leave it as it is WORKING_PATH=`pwd` ################################################################################## ################################################################################## ####### There should be no need to modify anything below this line ############################### ################################################################################## ################################################################################## # # type $AUTODOCK &> /dev/null || { echo -e "\nError: the file [$AUTODOCK] doesn't exist or is not executable\n"; echo -e "Try to specify the full path to the executable of the AutoDock binary in the script"; echo -e "( i.e. AUTODOCK=/usr/bin/autodock4 )\n\n"; echo -e " [ virtuals screening submission aborted]\n" exit 1; } type $QSUB &> /dev/null || { echo -e "\nError: the file [$QSUB] doesn't exist or is not executable\n"; echo -e "Try to specify the full path to the executable of the Qsub command binary in the script"; echo -e "( i.e. QSUB=/usr/bin/qsub )\n\n"; echo -e " [ virtuals screening submission aborted]\n" exit 1; } echo Starting submission... for NAME in `cat jobs_list` do cd $NAME echo "#!/bin/bash" > $NAME.job echo "cd $WORKING_PATH/$NAME" >> $NAME.job echo "$AUTODOCK -p $NAME.dpf -l $NAME.dlg" >> $NAME.job chmod +x $NAME.job echo -n "Submitting $NAME : " $QSUB $OPT -l cput=$CPUT -l nodes=1:ppn=1 -l walltime=$WALLT -l mem=$MEM $NAME.job sleep 23 # << add this line to avoid flooding the cluster with 1000nds of jobs cd .. done </pre> The wait time of 23 seconds may be reduced in order to speed up the calculation. c369ebb49eea40e519cdf45145616e90d060bd3b 149 148 2010-07-22T14:25:28Z Akukol 3 wikitext text/x-wiki [http://autodock.scripps.edu/ AutoDock] is software for molecular docking. It is best used together with the graphical user interface [http://mgltools.scripps.edu/downloads AutoDock Tools]. AutoDock Tools can be installed in your home directory e.g. 'akukol/bin/MGLtools'. You need to download the file 'mgltools_x86_64Linux2_1.5.4.tar.gz'. The automatic installer does not work. For virtual screening obtain the software [http://autodock.scripps.edu/resources/raccoon Raccoon]. This needs AutoDock tools to be installed first. Raccoon automatically generates the file vs_submit.sh. This must be edited according to your special circumstances (see below) and then started with: 'nohup vs_submit.sh &' (do not use qsub) <pre>#!/bin/bash # # Generated with Raccoon | AutoDockVS # #### PBS jobs parametersCPUT="00:20:00" WALLT="00:20:00" # << change here # # There should be no reason # for changing the following values NODES=1 PPN=1 MEM=512mb ### CUSTOM VARIABLES # # use the following line to set special options (e.g. specific queues) #OPT="-q MyPriorQueue" OPT="-j oe -N AutoDock" # join output and error, job name: Autodock # Paths for executables on the cluster # Modify them to specify custom executables to be used QSUB="qsub" # << change here AUTODOCK="/soft/autodock/autodock4" # << change here # Special path to move into before running # the screening. 
This is very system-specific, # so unless you're know what are you doing, # leave it as it is WORKING_PATH=`pwd` ################################################################################## ################################################################################## ####### There should be no need to modify anything below this line ############################### ################################################################################## ################################################################################## # # type $AUTODOCK &> /dev/null || { echo -e "\nError: the file [$AUTODOCK] doesn't exist or is not executable\n"; echo -e "Try to specify the full path to the executable of the AutoDock binary in the script"; echo -e "( i.e. AUTODOCK=/usr/bin/autodock4 )\n\n"; echo -e " [ virtuals screening submission aborted]\n" exit 1; } type $QSUB &> /dev/null || { echo -e "\nError: the file [$QSUB] doesn't exist or is not executable\n"; echo -e "Try to specify the full path to the executable of the Qsub command binary in the script"; echo -e "( i.e. QSUB=/usr/bin/qsub )\n\n"; echo -e " [ virtuals screening submission aborted]\n" exit 1; } echo Starting submission... for NAME in `cat jobs_list` do cd $NAME echo "#!/bin/bash" > $NAME.job echo "cd $WORKING_PATH/$NAME" >> $NAME.job echo "$AUTODOCK -p $NAME.dpf -l $NAME.dlg" >> $NAME.job chmod +x $NAME.job echo -n "Submitting $NAME : " $QSUB $OPT -l cput=$CPUT -l nodes=1:ppn=1 -l walltime=$WALLT -l mem=$MEM $NAME.job sleep 23 # << add this line to avoid flooding the cluster with 1000nds of jobs cd .. done </pre> The wait time of 23 seconds may be reduced in order to speed up the calculation. 2f099ce00e9c2a490ae11f1d92fd6c9a6e1cf18d Vina 0 23 115 2010-06-18T09:48:33Z Akukol 3 Created page with '[http://vina.scripps.edu/ AutoDock Vina] is software for molecular docking. It is very easy to use for virtual screening with shell scripts as explained in the manual. The mole…' wikitext text/x-wiki [http://vina.scripps.edu/ AutoDock Vina] is software for molecular docking. It is very easy to use for virtual screening with shell scripts as explained in the manual. The molecules need to be in pdbqt format. This format can be easily generated with Raccoon (see [[AutoDock]]). c7e261f37833a5296c8d87826dc7902021eb7cff 116 115 2010-06-18T09:49:18Z Akukol 3 wikitext text/x-wiki [http://vina.scripps.edu/ AutoDock Vina] is software for molecular docking. It is very easy to use for virtual screening with shell scripts as explained in the manual. The molecules need to be in pdbqt format. This format can be easily generated with Raccoon (see [[Autodock]]). 720ff398c5065c24800b8e6f00cbbd3069206cf4 122 116 2010-06-23T16:13:54Z Akukol 3 wikitext text/x-wiki [http://vina.scripps.edu/ AutoDock Vina] is software for molecular docking. It is very easy to use for virtual screening with shell scripts as explained in the manual. The molecules need to be in pdbqt format. This format can be easily generated with Raccoon (see [[Autodock]]). Copy the pdbqt files, the protein-receptor and the vina configuration file into the working directory. The following bash-script carries out the virtual screening. Start with 'nohup vinScreen.bash &' (not qsub). 
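The configuration file itself is not reproduced on this page; a minimal sketch of what conf.txt (as referenced in the script below) might contain is given here, where the receptor name, box centre and box size are placeholders to be replaced with values for your own system.
<pre>
receptor = protein.pdbqt
center_x = 0.0
center_y = 0.0
center_z = 0.0
size_x = 20.0
size_y = 20.0
size_z = 20.0
</pre>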
<pre>#!/bin/bash # start from the folder where the .pdbqt files are located mypath=`pwd` for f in ZINC*.pdbqt; do b="dock_$f" echo Processing ligand $b mkdir -p $b echo "#!/bin/bash" > job.sh echo "cd $mypath" >> job.sh echo /soft/autodock_vina_1_1_1_linux_x86/bin/vina --config $mypath/conf.txt --ligand $mypath/$f --out $mypath/${b}/out.pdbqt --log $mypath/${b}/log.txt >> job.sh qsub -N VinaScreen -j oe -l cput=00:12:00 -l nodes=1:ppn=8 -l walltime=00:12:00 -l mem=512mb job.sh sleep 30 # reduce this waiting time in seconds to speed up done </pre> 406c0c5f3294ec251305b34ae9b1059b1ae57c88 123 122 2010-06-23T16:14:47Z Akukol 3 wikitext text/x-wiki [http://vina.scripps.edu/ AutoDock Vina] is software for molecular docking. It is very easy to use for virtual screening with shell scripts as explained in the manual. The molecules need to be in pdbqt format. This format can be easily generated with Raccoon (see [[Autodock]]). Copy the pdbqt files, the protein-receptor and the vina configuration file (conf.txt in the example below) into the working directory. The following bash-script carries out the virtual screening. Start with 'nohup vinScreen.bash &' (not qsub). <pre>#!/bin/bash # start from the folder where the .pdbqt files are located mypath=`pwd` for f in ZINC*.pdbqt; do b="dock_$f" echo Processing ligand $b mkdir -p $b echo "#!/bin/bash" > job.sh echo "cd $mypath" >> job.sh echo /soft/autodock_vina_1_1_1_linux_x86/bin/vina --config $mypath/conf.txt --ligand $mypath/$f --out $mypath/${b}/out.pdbqt --log $mypath/${b}/log.txt >> job.sh qsub -N VinaScreen -j oe -l cput=00:12:00 -l nodes=1:ppn=8 -l walltime=00:12:00 -l mem=512mb job.sh sleep 30 # reduce this waiting time in seconds to speed up done </pre> c63b833f47732a36c4362ab783e9327f5bf50f85 124 123 2010-06-23T16:15:17Z Akukol 3 wikitext text/x-wiki [http://vina.scripps.edu/ AutoDock Vina] is software for molecular docking. It is very easy to use for virtual screening with shell scripts as explained in the manual. The molecules need to be in pdbqt format. This format can be easily generated with Raccoon (see [[Autodock]]). Copy the pdbqt files, the protein-receptor and the vina configuration file (conf.txt in the example below) into the working directory. The following bash-script carries out the virtual screening. Start with 'nohup vinaScreen.bash &' (not qsub). <pre>#!/bin/bash # start from the folder where the .pdbqt files are located mypath=`pwd` for f in ZINC*.pdbqt; do b="dock_$f" echo Processing ligand $b mkdir -p $b echo "#!/bin/bash" > job.sh echo "cd $mypath" >> job.sh echo /soft/autodock_vina_1_1_1_linux_x86/bin/vina --config $mypath/conf.txt --ligand $mypath/$f --out $mypath/${b}/out.pdbqt --log $mypath/${b}/log.txt >> job.sh qsub -N VinaScreen -j oe -l cput=00:12:00 -l nodes=1:ppn=8 -l walltime=00:12:00 -l mem=512mb job.sh sleep 30 # reduce this waiting time in seconds to speed up done </pre> 0df39ba0b810d567af94baad014afdb936372bd6 128 124 2010-07-02T13:45:57Z Akukol 3 wikitext text/x-wiki [http://vina.scripps.edu/ AutoDock Vina] is software for molecular docking. It is very easy to use for virtual screening with shell scripts as explained in the manual. The molecules need to be in pdbqt format. This format can be easily generated with Raccoon (see [[Autodock]]). Copy the pdbqt files, the protein-receptor and the vina configuration file (conf.txt in the example below) into the working directory. The following bash-script carries out the virtual screening. Start with 'nohup vinaScreen.bash > nohup.out &' (not qsub). 
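As with the other scripts on this wiki, the screening script must be made executable before it is started; a short sketch, assuming it is saved as vinaScreen.bash in the working directory:
<pre>
chmod +x vinaScreen.bash
nohup ./vinaScreen.bash > nohup.out &
</pre>
The individual docking jobs it submits can then be followed with qstat.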
<pre>#!/bin/bash # start from the folder where the .pdbqt files are located mypath=`pwd` for f in ZINC*.pdbqt; do b="dock_$f" echo Processing ligand $b mkdir -p $b echo "#!/bin/bash" > job.sh echo "cd $mypath" >> job.sh echo /soft/autodock_vina_1_1_1_linux_x86/bin/vina --config $mypath/conf.txt --ligand $mypath/$f --out $mypath/${b}/out.pdbqt --log $mypath/${b}/log.txt >> job.sh qsub -N VinaScreen -j oe -l cput=00:12:00 -l nodes=1:ppn=8 -l walltime=00:12:00 -l mem=512mb job.sh sleep 30 # reduce this waiting time in seconds to speed up done </pre> 2bdb7e4feeefb941a39372e35433dc482e9ed625 129 128 2010-07-02T13:46:51Z Akukol 3 wikitext text/x-wiki [http://vina.scripps.edu/ AutoDock Vina] is software for molecular docking. It is very easy to use for virtual screening with shell scripts as explained in the manual. The molecules need to be in pdbqt format. This format can be easily generated with Raccoon (see [[Autodock]]). Copy the pdbqt files, the protein-receptor and the vina configuration file (conf.txt in the example below) into the working directory. The following bash-script carries out the virtual screening. Start with 'nohup vinaScreen.bash > nohup.out &' (not qsub). <pre>#!/bin/bash # start from the folder where the .pdbqt files are located mypath=`pwd` for f in ZINC*.pdbqt; do b="dock_$f" echo Processing ligand $b mkdir -p $b echo "#!/bin/bash" > job.sh echo "cd $mypath" >> job.sh echo /soft/autodock_vina_1_1_1_linux_x86/bin/vina --config $mypath/conf.txt --ligand $mypath/$f --out $mypath/${b}/out.pdbqt --log $mypath/${b}/log.txt >> job.sh qsub -N $f -j oe -l cput=00:12:00 -l nodes=1:ppn=8 -l walltime=00:12:00 -l mem=512mb job.sh sleep 30 # reduce this waiting time in seconds to speed up done </pre> 221257bc4c7f7c214da6d54c5ff7563608c0d202 Architecture 0 7 125 62 2010-06-24T11:06:28Z Akukol 3 wikitext text/x-wiki The cluster consists of * a head node, which is an 16-core Xeon-based machine with 32 Gb RAM * 80 compute nodes (or just 'nodes'), of which ** 48 8-core Xeons with 24 Gb RAM and DDR Infiniband form the Main cluster (chassis 1, 2 and 3) ** 32 8-core Xeons with 12 Gb RAM and QDR Infiniband form the CAIR cluster (chassis 4 and 5) * 110 Tb of [[storage]] attached via Fibre Channel to the head node * Ethernet and infiniband switches to provide connectivity. The nodes are Dell blades installed in enclosures (chassis) of 16 blades each: so there are 5 chassis in total, 3 in the main cluster and 2 in the CAIR cluster. Nodes within a given chassis communicate via switches internal to the chassis; external switches connect the chassis together. See [[networking]] for more details. http://stri-cluster.herts.ac.uk/cluster.jpg 45bacc25467e6c1513b75a50f1b1636eb46bd1fa 135 125 2010-07-13T06:26:21Z Mjh 2 wikitext text/x-wiki The cluster consists of * a head node, which is an 16-core Xeon-based machine with 32 Gb RAM * 80 compute nodes (or just 'nodes'), of which ** 48 8-core Xeons with 24 Gb RAM and DDR Infiniband form the Main cluster (chassis 1, 2 and 3) ** 32 8-core Xeons with 12 Gb RAM and QDR Infiniband form the CAIR cluster (chassis 4 and 5) * Two [[SMP machines]] each with 48 cores, 256 Mb RAM and QDR Infiniband (smp1 and smp2). * 110 Tb of [[storage]] attached via Fibre Channel to the head node * Ethernet and infiniband switches to provide connectivity. The nodes are Dell blades installed in enclosures (chassis) of 16 blades each: so there are 5 chassis in total, 3 in the main cluster and 2 in the CAIR cluster. 
Nodes within a given chassis communicate via switches internal to the chassis; external switches connect the chassis together. See [[networking]] for more details. http://stri-cluster.herts.ac.uk/cluster.jpg 0772ed16211bc52a9d115f6a078bcf175bb803e7 145 135 2010-07-21T12:33:04Z Cjoslin 5 Not 8 cores wikitext text/x-wiki The cluster consists of * a head node, which is an 16-core Xeon-based machine with 32 Gb RAM * 80 compute nodes (or just 'nodes'), of which ** 48 1 socket x 4-core x 2 Hyperthreads Xeons with 24 Gb RAM and DDR Infiniband form the Main cluster (chassis 1, 2 and 3) ** 32 1 socket x 4-core x 2 Hyperthreads Xeons with 12 Gb RAM and QDR Infiniband form the CAIR cluster (chassis 4 and 5) * Two [[SMP machines]] each with 48 cores, 256 Mb RAM and QDR Infiniband (smp1 and smp2). * 110 Tb of [[storage]] attached via Fibre Channel to the head node * Ethernet and infiniband switches to provide connectivity. The nodes are Dell blades installed in enclosures (chassis) of 16 blades each: so there are 5 chassis in total, 3 in the main cluster and 2 in the CAIR cluster. Nodes within a given chassis communicate via switches internal to the chassis; external switches connect the chassis together. See [[networking]] for more details. http://stri-cluster.herts.ac.uk/cluster.jpg 57b1ca77678cd32d56c5ea411fedf85f5e52adb5 146 145 2010-07-21T12:35:14Z Cjoslin 5 wikitext text/x-wiki The cluster consists of * a head node, which is an 16-core Xeon-based machine with 32 Gb RAM * 80 compute nodes (or just 'nodes'), of which ** 48 Xeons 1 socket x 4-core x 2 Hyperthreads with 24 Gb RAM and DDR Infiniband form the Main cluster (chassis 1, 2 and 3) ** 32 Xeons 1 socket x 4-core x 2 Hyperthreads with 12 Gb RAM and QDR Infiniband form the CAIR cluster (chassis 4 and 5) * Two [[SMP machines]] each with 48 cores, 256 Mb RAM and QDR Infiniband (smp1 and smp2). * 110 Tb of [[storage]] attached via Fibre Channel to the head node * Ethernet and infiniband switches to provide connectivity. The nodes are Dell blades installed in enclosures (chassis) of 16 blades each: so there are 5 chassis in total, 3 in the main cluster and 2 in the CAIR cluster. Nodes within a given chassis communicate via switches internal to the chassis; external switches connect the chassis together. See [[networking]] for more details. http://stri-cluster.herts.ac.uk/cluster.jpg 036b1e5ab02629324e6f70b593c440d53b861250 147 146 2010-07-21T12:36:43Z Cjoslin 5 not 16 core head node wikitext text/x-wiki The cluster consists of * a head node, which is an 2 socket x 4-core x 2 Hyperthreads Xeon-based machine with 32 Gb RAM * 80 compute nodes (or just 'nodes'), of which ** 48 Xeons 1 socket x 4-core x 2 Hyperthreads with 24 Gb RAM and DDR Infiniband form the Main cluster (chassis 1, 2 and 3) ** 32 Xeons 1 socket x 4-core x 2 Hyperthreads with 12 Gb RAM and QDR Infiniband form the CAIR cluster (chassis 4 and 5) * Two [[SMP machines]] each with 48 cores, 256 Mb RAM and QDR Infiniband (smp1 and smp2). * 110 Tb of [[storage]] attached via Fibre Channel to the head node * Ethernet and infiniband switches to provide connectivity. The nodes are Dell blades installed in enclosures (chassis) of 16 blades each: so there are 5 chassis in total, 3 in the main cluster and 2 in the CAIR cluster. Nodes within a given chassis communicate via switches internal to the chassis; external switches connect the chassis together. See [[networking]] for more details. 
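In practice the choice between the Main and the CAIR nodes is made through the queue named in the job script (see [[Queues]]); a sketch of the two alternative directives:
<pre>
# pick one of the two in your job script:
#PBS -q main      # submits to the 48 Main-cluster nodes
#PBS -q cair_s    # submits to the 32 CAIR nodes (6 hour maximum wall time)
</pre>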
http://stri-cluster.herts.ac.uk/cluster.jpg dbced85d9ea4b071e101c870ff2e8e0344875940 150 147 2010-07-22T14:29:53Z Cjoslin 5 Corrected nodes info wikitext text/x-wiki The cluster consists of * a head node, which is an 2 socket x 4-core x 2 Hyperthreads Xeon-based machine with 32 Gb RAM * 80 compute nodes (or just 'nodes'), of which ** 48 Xeons(E5520s) 2 socket x 4-core Hyperthreading off with 24 Gb RAM and DDR Infiniband form the Main cluster (chassis 1, 2 and 3) ** 32 Xeons(E5520s) 2 socket x 4-core Hyperthreading off with 12 Gb RAM and QDR Infiniband form the CAIR cluster (chassis 4 and 5) * Two [[SMP machines]] each with 48 cores, 256 Mb RAM and QDR Infiniband (smp1 and smp2). * 110 Tb of [[storage]] attached via Fibre Channel to the head node * Ethernet and infiniband switches to provide connectivity. The nodes are Dell blades installed in enclosures (chassis) of 16 blades each: so there are 5 chassis in total, 3 in the main cluster and 2 in the CAIR cluster. Nodes within a given chassis communicate via switches internal to the chassis; external switches connect the chassis together. See [[networking]] for more details. http://stri-cluster.herts.ac.uk/cluster.jpg 69813107400e0db9da7d8d21505d76f8e6ab549b MPI 0 12 126 60 2010-06-24T21:55:02Z Mjh 2 /* MVAPICH2 */ mpiexec works now wikitext text/x-wiki == What is MPI? == MPI stands for ''Message Passing Interface''. It provides a standard for communication between processes either running on the same host or over a network. Programs written using this standard can communicate with each other without needing to know details of the way in which they are connected. For a more detailed description see the [http://en.wikipedia.org/wiki/Message_Passing_Interface wikipedia page]. The MPI standard (with extensions in some cases) has been implemented by many different groups. MPI tutorials are widely available on the web. == MPI on the cluster == There are several implementations of MPI on the cluster. All of them provide compilation tools (mpicc, mpif77), libraries, and a means of running code. All use of MPI on the cluster must be integrated with the [[jobs|job control system]] and this page describes how to do that. You should read the page on [[jobs]] before trying to do anything with any of the example scripts given here! === MPICH2 === MPICH2 is the default implementation since it is provided as a package within Fedora. If you take no action, your MPI code will be compiled with MPICH2. Check which implementation you're using like this: <pre> > which mpicc /usr/lib64/mpich2/bin/mpicc </pre> MPICH2 is '''not''' Infiniband-aware. The communication between processes will use the Infiniband network, but will not use it natively: thus the latency and bandwidth of IPC will be about an order of magnitude worse than it would be with Infiniband-aware implementations (see below). 
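Whichever implementation is selected, code is compiled with that implementation's wrapper; this is a sketch only, with the source file name a placeholder and the output matching the mympijob binary used in the example scripts below.
<pre>
mpicc -O2 -o /home/myusername/mympijob mympijob.c
</pre>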
To run MPICH2 jobs, your job control system script should call <tt>/usr/local/bin/mpiexec</tt>: <pre> #!/bin/sh -f #PBS -N mpich2-demo #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ /usr/local/bin/mpiexec /home/myusername/mympijob echo ------------------------------------------------------ echo Job ends </pre> <tt>mpiexec</tt> uses the Torque environment variables to start processes up on the appropriate nodes. Note that in this example the '''only''' key line is the one calling <tt>mpiexec</tt>. The others are Torque directives or are there for debugging/demonstration purposes. === MVAPICH2 === MVAPICH2 is one of the three MPI implementations installed as part of the [http://www.openfabrics.org/ OFED] Infiniband software distribution. It uses native Infiniband system calls to communicate. This means that the latency of IPC can be as low as a few microseconds and the full bandwidth of the Infiniband interfaces is available. To use MVIPICH2 run the command mpi-selector-menu on the head node and choose the appropriate option. When you next log in, your path and libraries will be set up correctly: <pre> > which mpicc /usr/mpi/gcc/mvapich2-1.4.1/bin/mpicc </pre> <tt>mpiexec</tt> also works for starting MVAPICH2 jobs: <pre> #!/bin/sh -f #PBS -N mvapich2-demo #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ /usr/local/bin/mpiexec /home/myusername/mympijob echo ------------------------------------------------------ echo Job ends </pre> === MVAPICH === This is an earlier Infiniband-aware implementation of MPI. Not recommended. === OpenMPI === This is the third implementation provided by the OFED packages; like the others, it can be selected using <tt>mpi-selector-menu</tt>. In principle it should be as good as MVIPICH2 from the point of view of Infiniband use. If you want to use it you will need to integrate it with Torque yourself -- please feel free to do so and to document it here. 6d5d5ce91b07c6963968acdcee899c0378cba10b 127 126 2010-06-28T16:18:40Z Mjh 2 /* MVAPICH2 */ wikitext text/x-wiki == What is MPI? == MPI stands for ''Message Passing Interface''. It provides a standard for communication between processes either running on the same host or over a network. 
Programs written using this standard can communicate with each other without needing to know details of the way in which they are connected. For a more detailed description see the [http://en.wikipedia.org/wiki/Message_Passing_Interface wikipedia page]. The MPI standard (with extensions in some cases) has been implemented by many different groups. MPI tutorials are widely available on the web. == MPI on the cluster == There are several implementations of MPI on the cluster. All of them provide compilation tools (mpicc, mpif77), libraries, and a means of running code. All use of MPI on the cluster must be integrated with the [[jobs|job control system]] and this page describes how to do that. You should read the page on [[jobs]] before trying to do anything with any of the example scripts given here! === MPICH2 === MPICH2 is the default implementation since it is provided as a package within Fedora. If you take no action, your MPI code will be compiled with MPICH2. Check which implementation you're using like this: <pre> > which mpicc /usr/lib64/mpich2/bin/mpicc </pre> MPICH2 is '''not''' Infiniband-aware. The communication between processes will use the Infiniband network, but will not use it natively: thus the latency and bandwidth of IPC will be about an order of magnitude worse than it would be with Infiniband-aware implementations (see below). To run MPICH2 jobs, your job control system script should call <tt>/usr/local/bin/mpiexec</tt>: <pre> #!/bin/sh -f #PBS -N mpich2-demo #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ /usr/local/bin/mpiexec /home/myusername/mympijob echo ------------------------------------------------------ echo Job ends </pre> <tt>mpiexec</tt> uses the Torque environment variables to start processes up on the appropriate nodes. Note that in this example the '''only''' key line is the one calling <tt>mpiexec</tt>. The others are Torque directives or are there for debugging/demonstration purposes. === MVAPICH2 === MVAPICH2 is one of the three MPI implementations installed as part of the [http://www.openfabrics.org/ OFED] Infiniband software distribution. It uses native Infiniband system calls to communicate. This means that the latency of IPC can be as low as a few microseconds and the full bandwidth of the Infiniband interfaces is available. To use MVIPICH2 run the command <tt>mpi-selector-menu</tt> on the head node and choose the appropriate option. 
When you next log in, your path and libraries will be set up correctly: <pre> > which mpicc /usr/mpi/gcc/mvapich2-1.4.1/bin/mpicc </pre> <tt>mpiexec</tt> also works for starting MVAPICH2 jobs: <pre> #!/bin/sh -f #PBS -N mvapich2-demo #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ /usr/local/bin/mpiexec /home/myusername/mympijob echo ------------------------------------------------------ echo Job ends </pre> === MVAPICH === This is an earlier Infiniband-aware implementation of MPI. Not recommended. === OpenMPI === This is the third implementation provided by the OFED packages; like the others, it can be selected using <tt>mpi-selector-menu</tt>. In principle it should be as good as MVIPICH2 from the point of view of Infiniband use. If you want to use it you will need to integrate it with Torque yourself -- please feel free to do so and to document it here. 2282d70549048621a9c70a8b911e4af96e8f711c 134 127 2010-07-07T13:32:40Z Mjh 2 wikitext text/x-wiki == What is MPI? == MPI stands for ''Message Passing Interface''. It provides a standard for communication between processes either running on the same host or over a network. Programs written using this standard can communicate with each other without needing to know details of the way in which they are connected. For a more detailed description see the [http://en.wikipedia.org/wiki/Message_Passing_Interface wikipedia page]. The MPI standard (with extensions in some cases) has been implemented by many different groups. MPI tutorials are widely available on the web. == MPI on the cluster == There are several implementations of MPI on the cluster. All of them provide compilation tools (mpicc, mpif77), libraries, and a means of running code. All use of MPI on the cluster must be integrated with the [[jobs|job control system]] and this page describes how to do that. You should read the page on [[jobs]] before trying to do anything with any of the example scripts given here! === MPICH2 === MPICH2 is the default implementation since it is provided as a package within Fedora. If you take no action, your MPI code will be compiled with MPICH2. Check which implementation you're using like this: <pre> > which mpicc /usr/lib64/mpich2/bin/mpicc </pre> MPICH2 is '''not''' Infiniband-aware. The communication between processes will use the Infiniband network, but will not use it natively: thus the latency and bandwidth of IPC will be about an order of magnitude worse than it would be with Infiniband-aware implementations (see below). 
To run MPICH2 jobs, your job control system script should call <tt>/usr/local/bin/mpiexec</tt>: <pre> #!/bin/sh -f #PBS -N mpich2-demo #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ /usr/local/bin/mpiexec /home/myusername/mympijob echo ------------------------------------------------------ echo Job ends </pre> <tt>mpiexec</tt> uses the Torque environment variables to start processes up on the appropriate nodes. Note that in this example the '''only''' key line is the one calling <tt>mpiexec</tt>. The others are Torque directives or are there for debugging/demonstration purposes. For documentation of <tt>mpiexec</tt>, see [http://www.osc.edu/~djohnson/mpiexec/index.php]. By default, <tt>mpiexec</tt> will run as many processes as there are processors available to it: so if you have nodes=16:ppn=8 set, as in the example above, there will be 16*8=128 MPI processes (8 per node). If for some reason you want to alter this default behaviour (e.g. if your MPI code needs to run multi-threaded, as would be the case if you were mixing MPI and OpenMP), then <tt>mpiexec</tt> has options to allow this: run <tt>/usr/local/bin/mpiexec</tt> on the head node with no arguments to get a list of possible options. === MVAPICH2 === MVAPICH2 is one of the three MPI implementations installed as part of the [http://www.openfabrics.org/ OFED] Infiniband software distribution. It uses native Infiniband system calls to communicate. This means that the latency of IPC can be as low as a few microseconds and the full bandwidth of the Infiniband interfaces is available. To use MVIPICH2 run the command <tt>mpi-selector-menu</tt> on the head node and choose the appropriate option. When you next log in, your path and libraries will be set up correctly: <pre> > which mpicc /usr/mpi/gcc/mvapich2-1.4.1/bin/mpicc </pre> <tt>mpiexec</tt> also works for starting MVAPICH2 jobs: <pre> #!/bin/sh -f #PBS -N mvapich2-demo #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ /usr/local/bin/mpiexec /home/myusername/mympijob echo ------------------------------------------------------ echo Job ends </pre> === MVAPICH === This is an earlier Infiniband-aware implementation of MPI. Not recommended. 
=== OpenMPI === This is the third implementation provided by the OFED packages; like the others, it can be selected using <tt>mpi-selector-menu</tt>. In principle it should be as good as MVIPICH2 from the point of view of Infiniband use. If you want to use it you will need to integrate it with Torque yourself -- please feel free to do so and to document it here. b367fe1e9f230b735a8a4ebd2c3778d8eec2572c SMP machines 0 24 136 2010-07-13T07:01:22Z Mjh 2 Created page with 'The SMP machines are two 4-processor, 48-core systems each with 256 Mb of RAM. The individual cores in these machines are slightly slower (~ 20% on CPU-bound floating-point oper…' wikitext text/x-wiki The SMP machines are two 4-processor, 48-core systems each with 256 Mb of RAM. The individual cores in these machines are slightly slower (~ 20% on CPU-bound floating-point operations) than the Intel CPUs of the main computing nodes. So users are recommended to use the main part of the cluster for straightforward computing requirements that do not need the special features of the SMP machines. The big advantage of the SMP machines is the large amount of physical memory visible to all 48 cores. This allows for multi-threaded, shared-memory applications. The SMP machines also each have 10 Tb of local scratch disc space (mounted as /scratch). Jobs may be started on the SMP machines using the <tt>smp</tt> [[queues|queue]] in Torque. We will also permit direct login in certain circumstances. Please discuss your requirements with us. == Restrictions == The SMP machines are 'at risk' until further notice while we continue to configure them. smp2 does not have its scratch disc set up yet due to a faulty hard drive. The SMP machines are running FC13 and Linux 2.6.35-rc4 (!). This is slightly different from other nodes of the cluster; please be alert to problems this may cause. Infiniband-aware MPI code will not run on the SMP machines, as the libraries are not yet installed (the underlying OS is too new for the OFED packages to compile). Since it is probably not sensible to run jobs spanning the main cluster and the smp machines (and since this is not currently possible via Torque in any case) this should not be a serious restriction. 040ef8a2a074eb69c03a14c39dca079bd45cd481 142 136 2010-07-13T07:28:21Z Mjh 2 wikitext text/x-wiki The SMP machines are two 4-processor, 48-core systems each with 256 Gb of RAM. The individual cores in these machines are slightly slower (~ 20% on CPU-bound floating-point operations) than the Intel CPUs of the main computing nodes. So users are recommended to use the main part of the cluster for straightforward computing requirements that do not need the special features of the SMP machines. The big advantage of the SMP machines is the large amount of physical memory visible to all 48 cores. This allows for multi-threaded, shared-memory applications. The SMP machines also each have 10 Tb of local scratch disc space (mounted as /scratch). Jobs may be started on the SMP machines using the <tt>smp</tt> [[queues|queue]] in Torque. We will also permit direct login in certain circumstances. Please discuss your requirements with us. == Restrictions == The SMP machines are 'at risk' until further notice while we continue to configure them. smp2 does not have its scratch disc set up yet due to a faulty hard drive. The SMP machines are running FC13 and Linux 2.6.35-rc4 (!). This is slightly different from other nodes of the cluster; please be alert to problems this may cause. 
Infiniband-aware MPI code will not run on the SMP machines, as the libraries are not yet installed (the underlying OS is too new for the OFED packages to compile). Since it is probably not sensible to run jobs spanning the main cluster and the smp machines (and since this is not currently possible via Torque in any case) this should not be a serious restriction. 49ecb818c970b99aceab7d32aef738517b8908c1 Main Page 0 1 137 69 2010-07-13T07:01:49Z Mjh 2 wikitext text/x-wiki == Welcome to the cluster documentation wiki == This wiki is the location for documentation for the STRI cluster. If you are a cluster user, feel free to register for an account so that you can describe any method you use to do a particular thing on the cluster. Some local documentation on [[MediaWiki]] is available. == Cluster basics == * [[Accounts]] * [[Policies]] * [[Access]] * [[Architecture]] * [[Networking]] * [[Storage]] * [[Administrators]]' contact details == Using the cluster == * [[Jobs]] * [[Queues]] * [[SMP machines]] * [[MPI]] * [[Parallelization|How to parallelize your job]] * [[Compilers]] * [[Software]] * [[Mail]] ad43eb102dc4e0355d37645478e34d0404843c02 Queues 0 15 138 87 2010-07-13T07:03:11Z Mjh 2 wikitext text/x-wiki There are five possible job queues available on the system: * 'main' is the default queue: this submits to the 48 nodes of the main cluster. There are no other limitations on this queue. * 'cair_s' submits to the 32 CAIR nodes, with a maximum wall time of 6 hours. * 'cair_l' also submits to the CAIR nodes with no maximum wall time. This queue is restricted to CAIR users. * 'all' submits to all 80 nodes, with a maximum wall time of 2 hours; note that MPI over Infiniband will not work if you request nodes spanning both the main and CAIR clusters. This queue should only be used in unusual circumstances -- normally it will be better to use one of the others. * 'smp' submits to the two [[SMP machines]]. == Default wall times == The default wall time limitation for the 'all' and 'cair_s' queues is also the maximum, i.e. 2 hours and 6 hours respectively. The default wall time for all the other queues is 24 hours. If you want a job to run for longer, you must explicitly provide a wall time estimate. 014e0f8534164bf1d059b30e2f97f342c9add5ce 139 138 2010-07-13T07:04:15Z Mjh 2 wikitext text/x-wiki There are five possible job queues available on the system: * 'main' is the default queue: this submits to the 48 nodes of the main cluster. There are no other limitations on this queue. * 'cair_s' submits to the 32 CAIR nodes, with a maximum wall time of 6 hours. * 'cair_l' also submits to the CAIR nodes with no maximum wall time. This queue is restricted to CAIR users. * 'all' submits to all 80 nodes, with a maximum wall time of 2 hours; note that MPI over Infiniband will not work if you request nodes spanning both the main and CAIR clusters. This queue should only be used in unusual circumstances -- normally it will be better to use one of the others. * 'smp' is a special queue that submits to the two [[SMP machines]]. It is not currently possible to run jobs that span the SMP machines and the main or CAIR clusters. == Default wall times == The default wall time limitation for the 'all' and 'cair_s' queues is also the maximum, i.e. 2 hours and 6 hours respectively. The default wall time for all the other queues is 24 hours. If you want a job to run for longer, you must explicitly provide a wall time estimate.
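As an illustration of how these queue names are used in practice (the script name <tt>myjob.sh</tt> is only a placeholder; see [[jobs]] for the full set of <tt>qsub</tt> options):

<pre>
# submit to the short CAIR queue, staying within its 6-hour limit
qsub -q cair_s -l walltime=4:00:00 -l nodes=8:ppn=8 myjob.sh

# submit to the default 'main' queue, explicitly asking for more than
# the 24-hour default wall time
qsub -q main -l walltime=48:00:00 -l nodes=4:ppn=8 myjob.sh
</pre>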
25f046a1287eaa9f6ce161fa60e7bebcad0cb5bf Networking 0 10 140 26 2010-07-13T07:06:55Z Mjh 2 wikitext text/x-wiki The nodes are linked by Gigabit ethernet and Infiniband networks. The ethernet network is a fairly conventional one; each chassis (see [[architecture]]) has an internal ethernet switch to which all nodes in that chassis are connected. These switches, together with the head node, are connected together via a main ethernet switch. Management traffic also uses this switch. There are in fact two infiniband networks: one for the main cluster, which is dual data-rate (nominal 2.5 Gbit/s) and one for the CAIR cluster, which is quad data-rate (5.0 Gbit/s). As for the ethernet, each chassis has an internal Infiniband switch. The two sub-clusters each have an external Infiniband switch which connects the chassis that make up the cluster and the head node is connected, separately, to both of those switches. Therefore there is no native infiniband connectivity between the two sub-clusters. IP over Infiniband can be used to connect between them, but this should not be relied on for large data volumes as it must be routed via the head node. The networking arrangements should be taken into consideration when running jobs where fast communication is important -- latency and data transfer rates are somewhat higher between nodes in the same chassis than between different chassis in the same cluster, and ethernet connections between the two sub-clusters are higher-latency and lower-bandwidth still, while native Infiniband connections between the two sub-clusters are essentially impossible. Best results will be obtained running jobs within a single chassis. None of the nodes are on the public Internet: they all have private IP addresses starting 192.168.... These addresses are not routable from anywhere else in the University. They can, however, all see the public Internet via IP masquerading through the head node through the ethernet network. (For this reason, it is not sensible to try to make high-volume accesses to the Internet from the nodes.) Each node has an IP address corresponding to its ethernet port (192.168.2.xxx where xxx is the node number) and one corresponding to the Infiniband port (192.168.3.xxx or 4.xxx, where the 3 and 4 refer to the main and CAIR clusters respectively). Within the cluster, you may use nodexxx.data (e.g. node001.data) and nodexxx.infi (node001.infi) to refer to these two networks. High-volume traffic should use the Infiniband network. The two SMP machines are attached to the Infiniband network of the main cluster and are called smp1 and smp2: thus they have addresses smp1.data, smp1.infi etc. 064d9d69c0e57bf732c7bf7740f11a683c722412 143 140 2010-07-20T12:42:53Z Mjh 2 wikitext text/x-wiki The nodes are linked by Gigabit ethernet and Infiniband networks. The ethernet network is a fairly conventional one; each chassis (see [[architecture]]) has an internal ethernet switch to which all nodes in that chassis are connected. These switches, together with the head node, are connected together via a main ethernet switch. Management traffic also uses this switch. The links connecting the head node and the SMP machines, and the uplinks from the two node internal ethernet stacks, are carried by multiple physical cables via link aggregation. There are in fact two infiniband networks: one for the main cluster, which is dual data-rate (nominal 2.5 Gbit/s) and one for the CAIR cluster, which is quad data-rate (5.0 Gbit/s). 
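If you want to check which rate the Infiniband port on a particular node has negotiated, the standard OFED diagnostic tools will report it, assuming they are installed on that node (a quick sketch rather than a supported interface):

<pre>
# show the state and rate of the local Infiniband port(s)
ibstat
</pre>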
As for the ethernet, each chassis has an internal Infiniband switch. The two sub-clusters each have an external Infiniband switch which connects the chassis that make up the cluster and the head node is connected, separately, to both of those switches. Therefore there is no native infiniband connectivity between the two sub-clusters. IP over Infiniband can be used to connect between them, but this should not be relied on for large data volumes as it must be routed via the head node. The networking arrangements should be taken into consideration when running jobs where fast communication is important -- latency and data transfer rates are somewhat higher between nodes in the same chassis than between different chassis in the same cluster, and ethernet connections between the two sub-clusters are higher-latency and lower-bandwidth still, while native Infiniband connections between the two sub-clusters are essentially impossible. Best results will be obtained running jobs within a single chassis. None of the nodes are on the public Internet: they all have private IP addresses starting 192.168.... These addresses are not routable from anywhere else in the University. They can, however, all see the public Internet via IP masquerading through the head node through the ethernet network. (For this reason, it is not sensible to try to make high-volume accesses to the Internet from the nodes.) Each node has an IP address corresponding to its ethernet port (192.168.2.xxx where xxx is the node number) and one corresponding to the Infiniband port (192.168.3.xxx or 4.xxx, where the 3 and 4 refer to the main and CAIR clusters respectively). Within the cluster, you may use nodexxx.data (e.g. node001.data) and nodexxx.infi (node001.infi) to refer to these two networks. High-volume traffic should use the Infiniband network. The two SMP machines are attached to the Infiniband network of the main cluster and are called smp1 and smp2: thus they have addresses smp1.data, smp1.infi etc. b525a5ffe8b47c0504f134287432979336be223c Storage 0 8 141 17 2010-07-13T07:08:10Z Mjh 2 wikitext text/x-wiki The '''storage''' available on the cluster is set up as follows: * 1 Tb of user home directories, mounted as /home * 65 Tb of scratch available to all users, mounted as /stri-data * 40 Tb of scratch for CAIR users only, mounted as /cair-data There is also an area for software to be shared over the network, mounted as /soft. These paths should work for both the head node and the compute nodes. Please use the scratch space responsibly: when you no longer need it, delete the contents or transfer them elsewhere so that others can use the space. Large astronomical datasets will be being processed on the cluster. (See also [[policies]].) The [[SMP machines]] have their own local 10-Tb scratch discs: see the relevant page for more information. 418f5eafc7007c1a62b5b5061d168d982b2cab1e Jobs 0 9 144 86 2010-07-20T13:25:44Z Mjh 2 wikitext text/x-wiki The main mechanism for carrying out computing on the cluster is the '''batch queue system'''. This is based around [http://www.clusterresources.com/torquedocs21/ Torque] (formerly known as PBS) and [http://www.clusterresources.com/products/maui/docs/mauiadmin.shtml Maui], which are free software and widely used in the HPC world, particularly in academia. Rather than running processes directly on the compute nodes, you submit a script which, in general, contains both directives to the queuing system and the commands needed to run your jobs. 
The scheduler will then decide, based on the available resources, whether your job can be run immediately or must be deferred till later. If other people's jobs are running, you are likely to have to wait. It is important therefore to request only the minimum resources you need: a job that only needs 8 nodes may be able to run immediately while a job that requests all 48 nodes in the main cluster may need to wait for days. Once your job starts running the standard output and standard error can be saved or discarded, at your option. Most jobs will operate on files in one of the cluster-wide file systems (see [[storage]]). You can also arrange to have the system send you e-mail when your job starts and/or ends. You should enable [[passwordless ssh]] between nodes before starting to use the job submission system. Various parts of it and code that depends on it rely on being able to use ssh or scp. == Basic commands == The command to submit a job to the batch queue system is <tt>qsub</tt>. <tt>qsub</tt> takes one obligatory argument, the name of the script that will be run to start the job: <pre> qsub myjob.sh </pre> This must be a script: specifying a binary file will not work. Optionally, qsub can take a number of arguments that specify what resources the job requires and how it is to be handled by the queuing system. Alternatively, these can be specified in the script, tagged with lines beginning <tt>#PBS</tt> before any executable lines. For example, <pre> cat << END > myjob1.sh #!/bin/sh echo "Hello world" END qsub -N hello -m abe myjob1.sh </pre> and <pre> cat <<END > myjob2.sh #!/bin/sh #PBS -N hello #PBS -m abe echo "Hello world" END qsub myjob2.sh </pre> are functionally equivalent. It is often more useful to have the options specified in the script. You can also have a mixture: if the same option is specified on the <tt>qsub</tt> command line and in a script directive, the command line takes precedence. A full list of qsub options is given by <tt>man qsub</tt> or is available [http://www.clusterresources.com/torquedocs21/commands/qsub.shtml online]. Some important ones are as follows: * -N: specify the name of the job as it will appear in the queue * -m: say when the system should e-mail you: <tt>-m abe</tt> means to e-mail on abort, begin and end * -k: specify whether the standard output or standard error streams of the job should be preserved (by default in your home directory). * -j: specify whether the output and error streams should be kept separate or merged. * -q: specify what [[queues|queue]] the job will be run on. * -t: start multiple jobs simultaneously (see below) The most important option is <tt>-l</tt>, which specifies the resources to be used by the job. On our system the useful resources to specify are * <tt>walltime</tt>: the expected execution time in h:m:s format. * <tt>nodes</tt>: a list of requirements on the nodes to be allocated, or possibly more than one joined by a plus sign. 
For example, <pre> qsub -l walltime=1:0:0 -l nodes=16:ppn=8 job.sh ## run for one hour on 16 nodes with 8 CPUs per node qsub -l walltime=0:1:0 -l nodes=2:chassis1:ppn=4 job2.sh ## run for one minute on 2 nodes from chassis1 with 4 CPUs per node qsub -l walltime=0:5:0 -l nodes=1:chassis1:ppn=1+1:chassis2:ppn=1 job3.sh ## run for 5 minutes on one node from chassis1 and one from chassis2 qsub -l walltime=0:0:10 -l nodes=node001.infi:ppn=8 job4.sh ## run for 10 seconds on all CPUs of node001 </pre> (In general you will not want to specify node or [[architecture|chassis]] names directly, as this will interfere with the efficiency of the scheduling. The exception might be if you want to guarantee that all inter-process communications take place within a chassis; see [[networking]]). Note also that in the present configuration requests for fewer processors per node than the 8 that are available may be consolidated onto fewer nodes. So the second request above would actually run on one physical node with 8 CPUs. This should never be a problem unless you are specifically testing inter-node communication. For efficiency, you are recommended always to ask for and use 8 processes per node for MPI or multi-threaded code. If you are running single-threaded code, you should ask for one processor per node. The defaults for all the other options are sensible but you ''must'' think carefully about the resources you need. Requesting more time or resources than you need will lead to inefficiency in scheduling your jobs: requesting less than you need (particularly in terms of walltime) is likely to mean that your job does not run to completion. A job that exceeds the walltime estimate will be terminated. ''A walltime should be specified: the default walltime (see [[Queues]]) is probably not what you want.'' Once a job is submitted, you can view its progress with <tt>qstat</tt>: <pre> >qstat Job id Name User Time Use S Queue ------------------- ---------------- --------------- -------- - ----- 1765.stri-cluster HCG62-doublebeta mjh 00:00:00 R all 1766.stri-cluster ...59-doublebeta mjh 00:00:00 R all 1767.stri-cluster ...07-doublebeta mjh 00:00:00 R all 1768.stri-cluster ...83-doublebeta mjh 00:00:00 R all 1769.stri-cluster ...73-doublebeta mjh 00:00:00 R all 1770.stri-cluster ...61-doublebeta mjh 0 Q all 1771.stri-cluster ...36-doublebeta mjh 0 Q all 1772.stri-cluster ...44-doublebeta mjh 0 Q all 1773.stri-cluster ...29-doublebeta mjh 0 Q all 1774.stri-cluster ...33-doublebeta mjh 0 Q all 1775.stri-cluster ...46-doublebeta mjh 0 Q all >qstat -a stri-cluster.herts.ac.uk: Req'd Req'd Elap Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time -------------------- -------- -------- ---------- ------ ----- --- ------ ----- - ----- 1765.stri-cluster.he mjh all HCG62-doub 11125 16 -- -- 05:00 R 01:56 1766.stri-cluster.he mjh all IC1459-dou 10658 16 -- -- 05:00 R 01:56 1767.stri-cluster.he mjh all NGC3607-do 7355 16 -- -- 05:00 R 01:56 1768.stri-cluster.he mjh all NGC383-dou 12069 16 -- -- 05:00 R 01:56 1769.stri-cluster.he mjh all NGC4073-do 12793 16 -- -- 05:00 R 01:56 1770.stri-cluster.he mjh all NGC4261-do -- 16 -- -- 05:00 Q -- 1771.stri-cluster.he mjh all NGC4636-do -- 16 -- -- 05:00 Q -- 1772.stri-cluster.he mjh all NGC5044-do -- 16 -- -- 05:00 Q -- 1773.stri-cluster.he mjh all NGC5129-do -- 16 -- -- 05:00 Q -- 1774.stri-cluster.he mjh all NGC533-dou -- 16 -- -- 05:00 Q -- 1775.stri-cluster.he mjh all NGC5846-do -- 16 -- -- 05:00 Q -- </pre> The different options to qstat give different views of the 
queue. Here we see that some jobs are running (R) and some are queued (Q). (Note that in the default <tt>qstat</tt> view the CPU time is not displayed correctly.) The MAUI tool <tt>showq</tt> can also be used to get a view of the whole queue: <pre> /usr/local/maui/bin/showq ACTIVE JOBS-------------------- JOBNAME USERNAME STATE PROC REMAINING STARTTIME 1765 mjh Running 128 1:15:20 Fri May 7 13:38:20 1766 mjh Running 128 1:15:20 Fri May 7 13:38:20 1767 mjh Running 128 1:15:20 Fri May 7 13:38:20 1768 mjh Running 128 1:15:20 Fri May 7 13:38:20 1769 mjh Running 128 1:15:20 Fri May 7 13:38:20 5 Active Jobs 640 of 640 Processors Active (100.00%) 80 of 80 Nodes Active (100.00%) IDLE JOBS---------------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME 1770 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1771 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1772 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1773 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1774 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1775 mjh Idle 128 5:00:00 Fri May 7 13:38:20 6 Idle Jobs BLOCKED JOBS---------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME Total Jobs: 11 Active Jobs: 5 Idle Jobs: 6 Blocked Jobs: 0 </pre> Finally, you can delete jobs from the queue with the Torque command <tt>qdel</tt>, you can change your resource request with <tt>qualter</tt>, and you can move queued jobs between different queues with <tt>qmove</tt>. == Running code == Your job will start running on ''one'' of the nodes it has been allocated. It is your script's responsibility to start your code running on any other nodes you have been allocated. At run time, environment variables contain information about the job. A full list is [http://www.clusterresources.com/torquedocs21/2.1jobsubmission.shtml#submitvars here]. The crucial one is $PBS_NODEFILE, which provides a list of nodes on which the job should run. ''You must honour this node list.'' The [http://www.clusterresources.com/torquedocs21/commands/pbsdsh.shtml pbsdsh] command can be used in your script to run a single program on all the allocated nodes, as in the following example: <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ pbsdsh hostname echo ------------------------------------------------------ echo Job ends </pre> pbsdsh would be suitable for starting a number of parallel processes which do not need to communicate, e.g. multiple identical Monte Carlo simulations. For scripts to start [[MPI]] processes in such a way that inter-process communication (IPC) works see the description of [[MPI]]. == Environment == In all cases your script must set the environment that you want your program to run in. Environment variables, aliases and working directory are not inherited from the shell that runs qsub. 
So you will often want to change to an appropriate working directory and to source some startup files in your qsub script before you actually execute any code. <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ cd /home/fred/my_working_directory export PATH=/home/fred/my_binaries:${PATH} /usr/local/bin/mpiexec my-mpi-code arg1 arg2 echo ------------------------------------------------------ echo Job ends </pre> == Multiple job submission == The <tt>-t</tt> option to qsub allows you to create multiple jobs simultaneously. These will all run the same script but will differ in the value of the environment variable <tt>$PBS_ARRAYID</tt>. Your script should normally use this variable to decide how to behave differently. For example, multiple runs of a Monte Carlo simulation might use it to store the output in different files. The jobs will all be submitted to the queue immediately and may run concurrently, depending on the resources they request, so it is important to make sure that it is safe to do this. A common mistake is to use the -t option wrongly. <pre> qsub -t 4 myjob.qsub </pre> will only start ''one'' job, with $PBS_ARRAYID set to 4. <pre> qsub -t 1-4 myjob.qsub </pre> will queue four jobs, with $PBS_ARRAYID ranging from 1 to 4. d6f7d92175cfeb22d1d44d49a9e037f80f28cda5 Architecture 0 7 151 150 2010-08-25T12:10:07Z Cjoslin 5 wikitext text/x-wiki The cluster consists of * a head node, which is an 2 socket x 4-core x 2 Hyperthreads Xeon-based machine with 32 Gb RAM * 80 compute nodes (or just 'nodes'), of which ** 48 Xeons(E5520s) 2 socket x 4-core Hyperthreading off with 24 Gb RAM and DDR Infiniband form the Main cluster (chassis 1, 2 and 3) ** 32 Xeons(E5520s) 2 socket x 4-core Hyperthreading off with 12 Gb RAM and QDR Infiniband form the CAIR cluster (chassis 4 and 5) * Two [[SMP machines]] each with 48 cores, 256 Gb RAM and QDR Infiniband (smp1 and smp2). * 110 Tb of [[storage]] attached via Fibre Channel to the head node * Ethernet and infiniband switches to provide connectivity. The nodes are Dell blades installed in enclosures (chassis) of 16 blades each: so there are 5 chassis in total, 3 in the main cluster and 2 in the CAIR cluster. Nodes within a given chassis communicate via switches internal to the chassis; external switches connect the chassis together. See [[networking]] for more details. 
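If you want to confirm from inside a job which kind of node it has landed on, the usual Linux interfaces are enough (a minimal sketch; nothing cluster-specific is assumed):

<pre>
# CPU model and number of cores visible on this node
grep -m1 'model name' /proc/cpuinfo
grep -c ^processor /proc/cpuinfo

# installed memory
free -g
</pre>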
http://stri-cluster.herts.ac.uk/cluster.jpg 7d3ebab032590efa2e6fbab1948e38a4fa7fbb93 164 151 2010-12-06T12:17:05Z Mjh 2 wikitext text/x-wiki The cluster consists of * a head node, which is an 2 socket x 4-core x 2 Hyperthreads Xeon-based machine with 32 Gb RAM * 80 compute nodes (or just 'nodes'), of which ** 48 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 24 Gb RAM and DDR Infiniband form the Main cluster (chassis 1, 2 and 3) ** 32 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 12 Gb RAM and QDR Infiniband form the CAIR cluster (chassis 4 and 5) * Two [[SMP machines]] each with 48 cores (4 sockets x 12 cores, Opteron 6174), 256 Gb RAM and QDR Infiniband (smp1 and smp2). * A [[Tesla|Tesla S2050]] CUDA unit attached to smp1. * 110 Tb of [[storage]] attached via Fibre Channel to the head node * Ethernet and infiniband switches to provide connectivity. The nodes are Dell blades installed in enclosures (chassis) of 16 blades each: so there are 5 chassis in total, 3 in the main cluster and 2 in the CAIR cluster. Nodes within a given chassis communicate via switches internal to the chassis; external switches connect the chassis together. See [[networking]] for more details. http://stri-cluster.herts.ac.uk/cluster.jpg 24f495ddfb026d10cab4c30f89e68eb90a51d22a 198 164 2011-03-23T15:55:17Z Mjh 2 wikitext text/x-wiki The cluster consists of * a head node, which is an 2 socket x 4-core x 2 Hyperthreads Xeon-based machine with 32 Gb RAM * 80 compute nodes (or just 'nodes'), of which ** 48 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 24 Gb RAM and DDR Infiniband form the Main cluster (chassis 1, 2 and 3) ** 32 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 12 Gb RAM and QDR Infiniband form the CAIR cluster (chassis 4 and 5) * Two [[SMP machines]] each with 48 cores (4 sockets x 12 cores, Opteron 6174), 256 Gb RAM and QDR Infiniband (smp1 and smp2). * A [[Tesla|Tesla S2050]] CUDA unit attached to smp1. * 4 sandbox machines (each with 2 x Xeon E5345 4-core CPUs, 8 Gb RAM, no infiniband) * 110 Tb of [[storage]] attached via Fibre Channel to the head node * Ethernet and infiniband switches to provide connectivity. The nodes are Dell blades installed in enclosures (chassis) of 16 blades each: so there are 5 chassis in total, 3 in the main cluster and 2 in the CAIR cluster. Nodes within a given chassis communicate via switches internal to the chassis; external switches connect the chassis together. See [[networking]] for more details. http://stri-cluster.herts.ac.uk/cluster.jpg fe532d6bd3f2c0bb5dd9ffb2c02fe872f8fb35c9 MPI 0 12 152 134 2010-09-08T12:22:36Z Mjh 2 wikitext text/x-wiki == What is MPI? == MPI stands for ''Message Passing Interface''. It provides a standard for communication between processes either running on the same host or over a network. Programs written using this standard can communicate with each other without needing to know details of the way in which they are connected. For a more detailed description see the [http://en.wikipedia.org/wiki/Message_Passing_Interface wikipedia page]. The MPI standard (with extensions in some cases) has been implemented by many different groups. MPI tutorials are widely available on the web. == MPI on the cluster == There are several implementations of MPI on the cluster. All of them provide compilation tools (mpicc, mpif77), libraries, and a means of running code. All use of MPI on the cluster must be integrated with the [[jobs|job control system]] and this page describes how to do that. 
You should read the page on [[jobs]] before trying to do anything with any of the example scripts given here! === MPICH2 === MPICH2 is the default implementation since it is provided as a package within Fedora. If you take no action, your MPI code will be compiled with MPICH2. Check which implementation you're using like this: <pre> > which mpicc /usr/lib64/mpich2/bin/mpicc </pre> MPICH2 is '''not''' Infiniband-aware. The communication between processes will use the Ethernet network, and so the latency and total available bandwidth of IPC will be about an order of magnitude worse than it would be with Infiniband-aware implementations (see below). To run MPICH2 jobs, your job control system script should call <tt>/usr/local/bin/mpiexec</tt>: <pre> #!/bin/sh -f #PBS -N mpich2-demo #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ /usr/local/bin/mpiexec /home/myusername/mympijob echo ------------------------------------------------------ echo Job ends </pre> <tt>mpiexec</tt> uses the Torque environment variables to start processes up on the appropriate nodes. Note that in this example the '''only''' key line is the one calling <tt>mpiexec</tt>. The others are Torque directives or are there for debugging/demonstration purposes. For documentation of <tt>mpiexec</tt>, see [http://www.osc.edu/~djohnson/mpiexec/index.php]. By default, <tt>mpiexec</tt> will run as many processes as there are processors available to it: so if you have nodes=16:ppn=8 set, as in the example above, there will be 16*8=128 MPI processes (8 per node). If for some reason you want to alter this default behaviour (e.g. if your MPI code needs to run multi-threaded, as would be the case if you were mixing MPI and OpenMP), then <tt>mpiexec</tt> has options to allow this: run <tt>/usr/local/bin/mpiexec</tt> on the head node with no arguments to get a list of possible options. === MVAPICH2 === MVAPICH2 is one of the three MPI implementations installed as part of the [http://www.openfabrics.org/ OFED] Infiniband software distribution. It uses native Infiniband system calls to communicate. This means that the latency of IPC can be as low as a few microseconds and the full bandwidth of the Infiniband interfaces is available. To use MVIPICH2 run the command <tt>mpi-selector-menu</tt> on the head node and choose the appropriate option. 
When you next log in, your path and libraries will be set up correctly: <pre> > which mpicc /usr/mpi/gcc/mvapich2-1.4.1/bin/mpicc </pre> <tt>mpiexec</tt> also works for starting MVAPICH2 jobs: <pre> #!/bin/sh -f #PBS -N mvapich2-demo #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ /usr/local/bin/mpiexec /home/myusername/mympijob echo ------------------------------------------------------ echo Job ends </pre> === MVAPICH === This is an earlier Infiniband-aware implementation of MPI. Not recommended. === OpenMPI === This is the third implementation provided by the OFED packages; like the others, it can be selected using <tt>mpi-selector-menu</tt>. In principle it should be as good as MVIPICH2 from the point of view of Infiniband use. If you want to use it you will need to integrate it with Torque yourself -- please feel free to do so and to document it here. 58be49ca4df8c782bd850c92a5212fc9465ae7e9 156 152 2010-10-15T07:22:29Z Mjh 2 /* MVAPICH2 */ wikitext text/x-wiki == What is MPI? == MPI stands for ''Message Passing Interface''. It provides a standard for communication between processes either running on the same host or over a network. Programs written using this standard can communicate with each other without needing to know details of the way in which they are connected. For a more detailed description see the [http://en.wikipedia.org/wiki/Message_Passing_Interface wikipedia page]. The MPI standard (with extensions in some cases) has been implemented by many different groups. MPI tutorials are widely available on the web. == MPI on the cluster == There are several implementations of MPI on the cluster. All of them provide compilation tools (mpicc, mpif77), libraries, and a means of running code. All use of MPI on the cluster must be integrated with the [[jobs|job control system]] and this page describes how to do that. You should read the page on [[jobs]] before trying to do anything with any of the example scripts given here! === MPICH2 === MPICH2 is the default implementation since it is provided as a package within Fedora. If you take no action, your MPI code will be compiled with MPICH2. Check which implementation you're using like this: <pre> > which mpicc /usr/lib64/mpich2/bin/mpicc </pre> MPICH2 is '''not''' Infiniband-aware. The communication between processes will use the Ethernet network, and so the latency and total available bandwidth of IPC will be about an order of magnitude worse than it would be with Infiniband-aware implementations (see below). 
To run MPICH2 jobs, your job control system script should call <tt>/usr/local/bin/mpiexec</tt>: <pre> #!/bin/sh -f #PBS -N mpich2-demo #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ /usr/local/bin/mpiexec /home/myusername/mympijob echo ------------------------------------------------------ echo Job ends </pre> <tt>mpiexec</tt> uses the Torque environment variables to start processes up on the appropriate nodes. Note that in this example the '''only''' key line is the one calling <tt>mpiexec</tt>. The others are Torque directives or are there for debugging/demonstration purposes. For documentation of <tt>mpiexec</tt>, see [http://www.osc.edu/~djohnson/mpiexec/index.php]. By default, <tt>mpiexec</tt> will run as many processes as there are processors available to it: so if you have nodes=16:ppn=8 set, as in the example above, there will be 16*8=128 MPI processes (8 per node). If for some reason you want to alter this default behaviour (e.g. if your MPI code needs to run multi-threaded, as would be the case if you were mixing MPI and OpenMP), then <tt>mpiexec</tt> has options to allow this: run <tt>/usr/local/bin/mpiexec</tt> on the head node with no arguments to get a list of possible options. === MVAPICH2 === MVAPICH2 is one of the three MPI implementations installed as part of the [http://www.openfabrics.org/ OFED] Infiniband software distribution. It uses native Infiniband system calls to communicate. This means that the latency of IPC can be as low as a few microseconds and the full bandwidth of the Infiniband interfaces is available. To use MVIPICH2 run the command <tt>mpi-selector-menu</tt> on the head node and choose the appropriate option. When you next log in, your path and libraries will be set up correctly: <pre> > which mpicc /usr/mpi/gcc/mvapich2-1.4.1/bin/mpicc </pre> <tt>mpiexec</tt> also works for starting MVAPICH2 jobs (but see [[Known problems]]). <pre> #!/bin/sh -f #PBS -N mvapich2-demo #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ /usr/local/bin/mpiexec /home/myusername/mympijob echo ------------------------------------------------------ echo Job ends </pre> === MVAPICH === This is an earlier Infiniband-aware implementation of MPI. Not recommended. 
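Whichever of these OFED implementations you choose, there is normally also a non-interactive <tt>mpi-selector</tt> command installed alongside <tt>mpi-selector-menu</tt> that can be scripted; the option names below come from the OFED documentation rather than from anything verified on this cluster, so check its help output before relying on them:

<pre>
# list the installed MPI stacks, then select one for your own account
# (the exact stack name will be whatever --list reports)
mpi-selector --list
mpi-selector --set mvapich2_gcc-1.4.1 --user
</pre>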
=== OpenMPI === This is the third implementation provided by the OFED packages; like the others, it can be selected using <tt>mpi-selector-menu</tt>. In principle it should be as good as MVIPICH2 from the point of view of Infiniband use. If you want to use it you will need to integrate it with Torque yourself -- please feel free to do so and to document it here. 100deabe4e89b05ddf9f7a56da9c8e46580c554c 194 156 2011-03-21T21:34:51Z Mjh 2 wikitext text/x-wiki == What is MPI? == MPI stands for ''Message Passing Interface''. It provides a standard for communication between processes either running on the same host or over a network. Programs written using this standard can communicate with each other without needing to know details of the way in which they are connected. For a more detailed description see the [http://en.wikipedia.org/wiki/Message_Passing_Interface wikipedia page]. The MPI standard (with extensions in some cases) has been implemented by many different groups. MPI tutorials are widely available on the web. == MPI on the cluster == There are several implementations of MPI on the cluster. All of them provide compilation tools (mpicc, mpif77), libraries, and a means of running code. All use of MPI on the cluster must be integrated with the [[jobs|job control system]] and this page describes how to do that. You should read the page on [[jobs]] before trying to do anything with any of the example scripts given here! === MPICH2 === MPICH2 is the default implementation since it is provided as a package within Fedora. If you take no action, your MPI code will be compiled with MPICH2. Check which implementation you're using like this: <pre> > which mpicc /usr/lib64/mpich2/bin/mpicc </pre> MPICH2 is '''not''' Infiniband-aware. The communication between processes will use the Ethernet network, and so the latency and total available bandwidth of IPC will be about an order of magnitude worse than it would be with Infiniband-aware implementations (see below). To run MPICH2 jobs, your job control system script should call <tt>/usr/local/bin/mpiexec</tt>: <pre> #!/bin/sh -f #PBS -N mpich2-demo #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ /usr/local/bin/mpiexec /home/myusername/mympijob echo ------------------------------------------------------ echo Job ends </pre> <tt>mpiexec</tt> uses the Torque environment variables to start processes up on the appropriate nodes. Note that in this example the '''only''' key line is the one calling <tt>mpiexec</tt>. The others are Torque directives or are there for debugging/demonstration purposes. For documentation of <tt>mpiexec</tt>, see [http://www.osc.edu/~djohnson/mpiexec/index.php]. By default, <tt>mpiexec</tt> will run as many processes as there are processors available to it: so if you have nodes=16:ppn=8 set, as in the example above, there will be 16*8=128 MPI processes (8 per node). 
If for some reason you want to alter this default behaviour (e.g. if your MPI code needs to run multi-threaded, as would be the case if you were mixing MPI and OpenMP), then <tt>mpiexec</tt> has options to allow this: run <tt>/usr/local/bin/mpiexec</tt> on the head node with no arguments to get a list of possible options. === MPICH2 (local versions) === Locally compiled, more up-to-date versions of MPICH2 are available. To use these use <tt>modules</tt> commands: do <pre> module unload mpich2-x86_64 module load mpich2-local OR module load mpich2-intel </pre> Then <pre> which mpicc /soft/mpich2/bin/mpicc </pre> If you wish to use these permanently, then you are recommended to put these module commands in your .cshrc or .bashrc. Currently you should run MPI jobs of this sort using the built-in <tt>mpiexec</tt> command, which is Torque-aware. For example, <pre> #!/bin/sh -f #PBS -N sandbox-mpi #PBS -m abe #PBS -l nodes=4:ppn=8 #PBS -k oe #PBS -q sandbox #PBS -l walltime=00:02:00 echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: array ID is $PBS_ARRAYID echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ /soft/mpich2/bin/mpiexec -rmk pbs /home/mjh/c/mpi/examples/basic/cpi echo ------------------------------------------------------ echo Job ends </pre> === MVAPICH2 === MVAPICH2 is one of the three MPI implementations installed as part of the [http://www.openfabrics.org/ OFED] Infiniband software distribution. It uses native Infiniband system calls to communicate. This means that the latency of IPC can be as low as a few microseconds and the full bandwidth of the Infiniband interfaces is available. To use MVAPICH2 do <pre> module unload mpich2-x86_64 module load mvapich2 </pre> Then you should see <pre> > which mpicc /usr/mpi/gcc/mvapich2-1.4.1/bin/mpicc </pre> <tt>mpiexec</tt> also works for starting MVAPICH2 jobs (but see [[Known problems]]). <pre> #!/bin/sh -f #PBS -N mvapich2-demo #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ /usr/local/bin/mpiexec /home/myusername/mympijob echo ------------------------------------------------------ echo Job ends </pre> === MVAPICH === This is an earlier Infiniband-aware implementation of MPI. Not recommended. === OpenMPI === This is the third implementation provided by the OFED packages; like the others, it can be selected via the [[modules]] system. 
In principle it should be as good as MVIPICH2 from the point of view of Infiniband use. If you want to use it you will need to integrate it with Torque yourself -- please feel free to do so and to document it here. 080752f9469dca13ee5b32c3a351f62f48bcc15f Storage 0 8 153 141 2010-09-16T17:10:52Z Mjh 2 policy, backups wikitext text/x-wiki The '''storage''' available on the cluster is set up as follows: * 1 Tb of user home directories, mounted as /home * 65 Tb of scratch available to all users, mounted as /stri-data * 40 Tb of scratch for CAIR users only, mounted as /cair-data There is also an area for software to be shared over the network, mounted as /soft. These paths should work for both the head node and the compute nodes. The /home disc is for files that are important and require long-term storage (source code, qsub scripts, etc). It should not be used for general data or for large quantities of output from jobs, for example. Large files should be stored on the relevant data disk (if you want to work in /home, use a symbolic link). Please use the scratch space responsibly: when you no longer need it, delete the contents or transfer them elsewhere so that others can use the space. Large astronomical datasets will be being processed on the cluster. (See also [[policies]].) The [[SMP machines]] have their own local 10-Tb scratch discs: see the relevant page for more information. No data area on the cluster is currently backed up. You must take responsibility for your own backups. In the longer term we expect to be able to make regular backups of /home and possibly some of /stri-data . ed090bf31b2487c7b4a2538824765003e79dd8f0 Main Page 0 1 154 137 2010-10-15T07:03:57Z Mjh 2 wikitext text/x-wiki == Welcome to the cluster documentation wiki == This wiki is the location for documentation for the STRI cluster. If you are a cluster user, feel free to register for an account so that you can describe any method you use to do a particular thing on the cluster. Some local documentation on [[MediaWiki]] is available. == Known problems == * [[Known problems]] == Cluster basics == * [[Accounts]] * [[Policies]] * [[Access]] * [[Architecture]] * [[Networking]] * [[Storage]] * [[Administrators]]' contact details == Using the cluster == * [[Jobs]] * [[Queues]] * [[SMP machines]] * [[MPI]] * [[Parallelization|How to parallelize your job]] * [[Compilers]] * [[Software]] * [[Mail]] b08b82541e7f4bbb44b8a3c3b0f66f823dcbaaa2 168 154 2010-12-06T12:54:15Z Mjh 2 /* Using the cluster */ wikitext text/x-wiki == Welcome to the cluster documentation wiki == This wiki is the location for documentation for the STRI cluster. If you are a cluster user, feel free to register for an account so that you can describe any method you use to do a particular thing on the cluster. Some local documentation on [[MediaWiki]] is available. == Known problems == * [[Known problems]] == Cluster basics == * [[Accounts]] * [[Policies]] * [[Access]] * [[Architecture]] * [[Networking]] * [[Storage]] * [[Administrators]]' contact details == Using the cluster == * [[Jobs]] * [[Queues]] * [[SMP machines]] * [[Tesla]] * [[MPI]] * [[Parallelization|How to parallelize your job]] * [[Compilers]] * [[Software]] * [[Mail]] 3f50b7b77461de898e2f5e4c2388b690684cbd70 178 168 2011-02-18T09:02:25Z Mjh 2 wikitext text/x-wiki == Welcome to the cluster documentation wiki == This wiki is the location for documentation for the STRI cluster. 
If you are a cluster user, feel free to register for an account so that you can describe any method you use to do a particular thing on the cluster. Some local documentation on [[MediaWiki]] is available. == Known problems == * [[Known problems]] == Cluster basics == * [[Accounts]] * [[Policies]] * [[Access]] * [[Architecture]] * [[Networking]] * [[Storage]] * [[Administrators]]' contact details == Using the cluster == * [[Jobs]] * [[Queues]] * [[SMP machines]] * [[Tesla]] * [[MPI]] * [[Parallelization|How to parallelize your job]] * [[Compilers]] * [[Software]] * [[Mail]] == Publications == * [[Acknowledgements]] * [[Cluster bibliography]] 04b66f658064190574f1d7c5abcede3ba188cec5 185 178 2011-03-10T15:11:57Z Mjh 2 wikitext text/x-wiki == Welcome to the cluster documentation wiki == This wiki is the location for documentation for the STRI cluster. If you are a cluster user, feel free to register for an account so that you can describe any method you use to do a particular thing on the cluster. Some local documentation on [[MediaWiki]] is available. == Known problems == * [[Known problems]] == Cluster basics == * [[Accounts]] * [[Policies]] * [[Access]] * [[Architecture]] * [[Networking]] * [[Storage]] * [[Administrators]]' contact details == Using the cluster == * [[Jobs]] * [[Queues]] * [[SMP machines]] * [[Sandbox]] * [[Tesla]] * [[MPI]] * [[Parallelization|How to parallelize your job]] * [[Compilers]] * [[Software]] * [[Mail]] == Publications == * [[Acknowledgements]] * [[Cluster bibliography]] ca65dd0e1cafd49d3fbf7b6b6e553f53d0d77558 188 185 2011-03-16T20:27:31Z Mjh 2 /* Using the cluster */ wikitext text/x-wiki == Welcome to the cluster documentation wiki == This wiki is the location for documentation for the STRI cluster. If you are a cluster user, feel free to register for an account so that you can describe any method you use to do a particular thing on the cluster. Some local documentation on [[MediaWiki]] is available. == Known problems == * [[Known problems]] == Cluster basics == * [[Accounts]] * [[Policies]] * [[Access]] * [[Architecture]] * [[Networking]] * [[Storage]] * [[Administrators]]' contact details == Using the cluster == * [[Jobs]] * [[Queues]] * [[SMP machines]] * [[Sandbox]] * [[Tesla]] * [[MPI]] * [[Parallelization|How to parallelize your job]] * [[Compilers]] * [[Software]] * [[Mail]] * [[Web server]] == Publications == * [[Acknowledgements]] * [[Cluster bibliography]] 2e03eddd92d8261297dea022398a632339d142a7 191 188 2011-03-21T20:57:11Z Mjh 2 wikitext text/x-wiki == Welcome to the cluster documentation wiki == This wiki is the location for documentation for the STRI cluster. If you are a cluster user, feel free to register for an account so that you can describe any method you use to do a particular thing on the cluster. Some local documentation on [[MediaWiki]] is available. 
== Known problems == * [[Known problems]] == Cluster basics == * [[Accounts]] * [[Policies]] * [[Access]] * [[Architecture]] * [[Networking]] * [[Storage]] * [[Administrators]]' contact details == Using the cluster == * [[Jobs]] * [[Queues]] * [[SMP machines]] * [[Sandbox]] * [[Tesla]] * [[Modules]] * [[MPI]] * [[Parallelization|How to parallelize your job]] * [[Compilers]] * [[Software]] * [[Mail]] * [[Web server]] == Publications == * [[Acknowledgements]] * [[Cluster bibliography]] df88f7325255b17bd79b96216e058530626f62a4 Known problems 0 25 155 2010-10-15T07:21:28Z Mjh 2 Created page with '== Known problems == * Nodes 001-008 of the main cluster are powered off, and have been for some months. This is because the air conditioning capacity in the server room is not …' wikitext text/x-wiki == Known problems == * Nodes 001-008 of the main cluster are powered off, and have been for some months. This is because the air conditioning capacity in the server room is not adequate. We are working with estates to solve this problem. * Although the Infiniband links carry NFS traffic between the nodes and the head node, they are not using RDMA (although they should be) and so their latency and bandwidth is lower than it should be. This seems to be a problem with the Linux kernels on the nodes and will only be solved by an upgrade. * The Fedora distribution on the nodes (FC12) is rather old in general and should probably be upgraded. * There is a problem on certain nodes whereby mpiexec fails to start MVAPICH2 jobs: errors of the following form are seen: <pre> Fatal error in MPI_Init: Other MPI error, error stack: MPIR_Init_thread(311).......: Initialization failed MPID_Init(191)..............: channel initialization failed MPIDI_CH3_Init(163).........: MPIDI_CH3I_RDMA_init(184)...: rdma_setup_startup_ring(373): cannot create cq </pre> At present we don't understand this problem or why it only happens on certain nodes. A workround is to use the old ssh-based method of starting jobs, as implemented in the script <tt>/soft/bin/torque-mv</tt> . This script is less flexible and less well integrated with Torque than mpiexec, but it does work. Note that mpiexec still works, and is still the recommended method, for jobs using MPICH2. 9827dc9142b830a47acb37c33cc1b4fa2f6d914a 163 155 2010-12-06T12:14:31Z Mjh 2 /* Known problems */ wikitext text/x-wiki == Known problems == * Nodes 001-008 of the main cluster are powered off, and have been for some months. This is because the air conditioning capacity in the server room is not adequate. We are working with estates to solve this problem. As of Dec 2010 Estates have accepted that they are responsible for the problem and are planning to rectify it by installing additional ACUs. * Although the Infiniband links carry NFS traffic between the nodes and the head node, they are not using RDMA (although they should be) and so their latency and bandwidth is lower than it should be. This seems to be a problem with the Linux kernels on the nodes and will only be solved by an upgrade. * The Fedora distribution on the nodes (FC12) is rather old in general and should probably be upgraded. 
* There is a problem on certain nodes whereby mpiexec fails to start MVAPICH2 jobs: errors of the following form are seen: <pre> Fatal error in MPI_Init: Other MPI error, error stack: MPIR_Init_thread(311).......: Initialization failed MPID_Init(191)..............: channel initialization failed MPIDI_CH3_Init(163).........: MPIDI_CH3I_RDMA_init(184)...: rdma_setup_startup_ring(373): cannot create cq </pre> At present we don't understand this problem or why it only happens on certain nodes. A workround is to use the old ssh-based method of starting jobs, as implemented in the script <tt>/soft/bin/torque-mv</tt> . This script is less flexible and less well integrated with Torque than mpiexec, but it does work. Note that mpiexec still works, and is still the recommended method, for jobs using MPICH2. 3687ea24477f9e96c3bf35a398dbf9dda5162861 170 163 2011-02-03T11:53:49Z Mjh 2 /* Known problems */ wikitext text/x-wiki == Known problems == * Although the Infiniband links carry NFS traffic between the nodes and the head node, they are not using RDMA (although they should be) and so their latency and bandwidth is lower than it should be. This seems to be a problem with the Linux kernels on the nodes and will only be solved by an upgrade. * The Fedora distribution on the nodes (FC12) is rather old in general and should probably be upgraded. * There is a problem on certain nodes whereby mpiexec fails to start MVAPICH2 jobs: errors of the following form are seen: <pre> Fatal error in MPI_Init: Other MPI error, error stack: MPIR_Init_thread(311).......: Initialization failed MPID_Init(191)..............: channel initialization failed MPIDI_CH3_Init(163).........: MPIDI_CH3I_RDMA_init(184)...: rdma_setup_startup_ring(373): cannot create cq </pre> At present we don't understand this problem or why it only happens on certain nodes. A workround is to use the old ssh-based method of starting jobs, as implemented in the script <tt>/soft/bin/torque-mv</tt> . This script is less flexible and less well integrated with Torque than mpiexec, but it does work. Note that mpiexec still works, and is still the recommended method, for jobs using MPICH2. 77c4c2873799105dbc2e7dc32c576e1d2d374bb7 171 170 2011-02-03T11:55:20Z Mjh 2 /* Known problems */ wikitext text/x-wiki == Known problems == * Although the Infiniband links carry NFS traffic between the nodes and the head node, they are not using RDMA (although they should be) and so their latency and bandwidth is lower than it should be. This seems to be a problem with the Linux kernels on the nodes and will only be solved by an upgrade. * The Fedora distribution on the nodes (FC12) is rather old in general and should probably be upgraded. * There is a problem on certain nodes whereby mpiexec fails to start MVAPICH2 jobs: errors of the following form are seen: <pre> Fatal error in MPI_Init: Other MPI error, error stack: MPIR_Init_thread(311).......: Initialization failed MPID_Init(191)..............: channel initialization failed MPIDI_CH3_Init(163).........: MPIDI_CH3I_RDMA_init(184)...: rdma_setup_startup_ring(373): cannot create cq </pre> At present we don't understand this problem or why it only happens on certain nodes. A workround is to use the old ssh-based method of starting jobs, as implemented in the script <tt>/soft/bin/torque-mv</tt> . This script is less flexible and less well integrated with Torque than mpiexec, but it does work. Note that mpiexec still works, and is still the recommended method, for jobs using MPICH2. 
* I/O intensive operations on the data or home directories can cause nodes to lock up in some situations. This only happens occasionally, but is clearly associated with NFS activity. We hope that an upgrade of the kernels (see above) will solve this problem too. 4b1357d151da9c4b0161395616ff34f29cf3fd90 199 171 2011-03-24T04:00:19Z Mjh 2 wikitext text/x-wiki == Known problems == * Although the Infiniband links carry NFS traffic between the nodes and the head node, they are not using RDMA (although they should be) and so their latency and bandwidth is lower than it should be. This seems to be a problem with the Linux kernels on the nodes and will only be solved by an upgrade. * The Fedora distribution on the nodes (FC12) is rather old in general and should probably be upgraded. * There is a problem on certain nodes whereby mpiexec fails to start MVAPICH2 jobs: errors of the following form are seen: <pre> Fatal error in MPI_Init: Other MPI error, error stack: MPIR_Init_thread(311).......: Initialization failed MPID_Init(191)..............: channel initialization failed MPIDI_CH3_Init(163).........: MPIDI_CH3I_RDMA_init(184)...: rdma_setup_startup_ring(373): cannot create cq </pre> At present we don't understand this problem or why it only happens on certain nodes. A workround is to use the old ssh-based method of starting jobs, as implemented in the script <tt>/soft/bin/torque-mv</tt> . This script is less flexible and less well integrated with Torque than mpiexec, but it does work. Note that mpiexec still works, and is still the recommended method, for jobs using MPICH2. * I/O intensive operations on the data or home directories can cause nodes to lock up in some situations. This only happens occasionally, but is clearly associated with NFS activity. We hope that an upgrade of the kernels (see above) will solve this problem too. See also a list of [[actions for upgrade]]. 590859734f2843f81d95bddd8723f54f3e90c3ef 211 199 2011-04-22T21:21:48Z Mjh 2 wikitext text/x-wiki == Known problems == * Although the Infiniband links carry NFS traffic between the nodes and the head node, they are not using RDMA (although they should be) and so their latency and bandwidth is lower than it should be. This seems to be a problem with the Linux kernels on the nodes; NFS over RDMA+infiniband is unstable. * I/O intensive operations on the data or home directories can cause nodes to lock up in some situations. This only happens occasionally, but is clearly associated with NFS activity. See also a list of [[actions for upgrade]]. e0b11cde708a7c1ae72c6788bbfcc5db31de7a40 Policies 0 4 157 47 2010-10-15T07:28:09Z Mjh 2 wikitext text/x-wiki The cluster is by design a shared resource. In using it you must be considerate of other users. Some detailed guidelines are as follows: * The [[architecture|head node]] must '''never''' be used for computation of any kind on a larger scale than a few minutes' testing. * The normal method of using the compute nodes is by way of the [[jobs|batch queuing system]]. You '''must not''' log in to the compute nodes directly unless you have been given specific authorization to do so (e.g. if you have got approval to dedicate one or more nodes to data reduction tasks). * When using the batch queuing system you '''must''' honour the allocations of nodes that your job is given. * Please use the [[storage]] with consideration to others. We reserve the right in extreme situations to tidy up after you. 
* We recommend that you write code with an awareness of the physical memory available in each machine (see [[Architecture]]) but, more importantly, if you are likely to exceed the physical memory limits, and may push the Linux kernel to start killing random processes, please make sure that you do not do this on any machine that may be shared with others -- i.e., make sure that you have exclusive use of any node on which you are going to take such risks. 211f1cabb85994a6458a9b3201763bbd4f5b38a2 162 157 2010-11-23T15:58:27Z Mjh 2 wikitext text/x-wiki The cluster is by design a shared resource. In using it you must be considerate of other users. Some detailed guidelines are as follows: * The [[architecture|head node]] must '''never''' be used for computation of any kind on a larger scale than a few minutes' testing. * The normal method of using the compute nodes is by way of the [[jobs|batch queuing system]]. You '''must not''' log in to the compute nodes directly unless you have been given specific authorization to do so (e.g. if you have got approval to dedicate one or more nodes to data reduction tasks). * When using the batch queuing system you '''must''' honour the allocations of nodes that your job is given. * Please use the [[storage]] with consideration to others. We reserve the right in extreme situations to tidy up after you. * We recommend that you write code with an awareness of the physical memory available in each machine (see [[Architecture]]) but, more importantly, if you are likely to exceed the physical memory limits, and may push the Linux kernel to start killing random processes, please make sure that you do not do this on any machine that may be shared with others -- i.e., make sure that you have exclusive use of any node on which you are going to take such risks. * There is a fair-share policy which has been tuned to a certain extent and is operated automatically by the scheduler. Essentially this means that if you have been making heavy use of the cluster on a timescale of days (to be exact, a weighted integral over the past week is taken) future jobs will be given a lower priority; you may find that other people's jobs overtake yours in the queue in an attempt to distribute resources fairly. The intention is that this system will operate only on this timescale of days, though. We do not presently guarantee fairness on timescale of minutes to hours, because to do that we would have to be able to pre-empt jobs that are already running. Please be aware of these limitations of the fair-share policy and work within them. 4a9d2c74d656541e5cbf7677986292d123f6ba0a 207 162 2011-04-22T20:39:30Z Mjh 2 wikitext text/x-wiki The cluster is by design a shared resource. In using it you must be considerate of other users. Some detailed guidelines are as follows: * The [[architecture|head node]] must '''never''' be used for computation of any kind on a larger scale than a few minutes' testing. * The normal method of using the compute nodes is by way of the [[jobs|batch queuing system]]. You '''must not''' log in to the compute nodes directly unless you have been given specific authorization to do so (e.g. if you have got approval to dedicate one or more nodes to data reduction tasks). Even then, it would be best to use the [[interactive jobs]] facility. * When using the batch queuing system you '''must''' honour the allocations of nodes that your job is given. * Please use the [[storage]] with consideration to others. We reserve the right in extreme situations to tidy up after you. 
* We recommend that you write code with an awareness of the physical memory available in each machine (see [[Architecture]]) but, more importantly, if you are likely to exceed the physical memory limits, and may push the Linux kernel to start killing random processes, please make sure that you do not do this on any machine that may be shared with others -- i.e., make sure that you have exclusive use of any node on which you are going to take such risks. * There is a fair-share policy which has been tuned to a certain extent and is operated automatically by the scheduler. Essentially this means that if you have been making heavy use of the cluster on a timescale of days (to be exact, a weighted integral over the past week is taken) future jobs will be given a lower priority; you may find that other people's jobs overtake yours in the queue in an attempt to distribute resources fairly. The intention is that this system will operate only on this timescale of days, though. We do not presently guarantee fairness on timescale of minutes to hours, because to do that we would have to be able to pre-empt jobs that are already running. Please be aware of these limitations of the fair-share policy and work within them. f9d9447d14aa2e13ea8216b0e1ac4947c0818785 Queues 0 15 158 139 2010-11-04T14:42:46Z Mjh 2 wikitext text/x-wiki There are five possible job queues available on the system: * 'main' is the default queue: this submits to the 48 nodes of the main cluster. There are no other limitations on this queue. * 'cair_s' submits to the 32 CAIR nodes, with a maximum wall time of 6 hours. * 'cair_l' also submits to the CAIR nodes with no maximum wall time. This queue is restricted to CAIR users. * 'smp' is a special queue that submits to the two [[SMP machines]]. It is not currently possible to run MPI jobs that span the SMP machines and the main or CAIR clusters. * Finally 'all' submits to all 80 nodes and the two SMP machines, with a maximum wall time of 2 hours; note that MPI over Infiniband will not work if you request nodes spanning both the main and CAIR clusters; MPI over Infiniband does not work at all on the [[SMP machines]]. This queue should only be used in unusual circumstances -- normally it will be better to use one of the others. == Default wall times == The default wall time limitation for the 'all' and 'cair_s' queues is also the maximum, i.e. 2 hours and 6 hours respectively. The default wall time for all the other queues is 24 hours. If you want a job to run for longer, you must explicitly provide a wall time estimate. 6699f8020c3e685d3077debc35dba6e760610668 160 158 2010-11-17T13:03:23Z Mjh 2 wikitext text/x-wiki There are four possible job queues available on the system: * 'main' is the default queue: this submits to the 48 nodes of the main cluster. There are no other limitations on this queue. * 'cair_s' submits to the 32 CAIR nodes, with a maximum wall time of 6 hours. * 'cair_l' also submits to the CAIR nodes with no maximum wall time. This queue is restricted to CAIR users. * 'smp' is a special queue that submits to the two [[SMP machines]]. It is not currently possible to run MPI jobs that span the SMP machines and the main or CAIR clusters. == Default wall times == The default wall time limitation for the 'cair_s' queue is also the maximum, i.e. 6 hours. The default wall time for all the other queues is 24 hours. If you want a job to run for longer, you must explicitly provide a wall time estimate. 
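For example, to ask for a 48-hour run on the main queue, using the <tt>qsub</tt> syntax described on the [[jobs]] page (the figures are purely illustrative):
<pre>
qsub -q main -l walltime=48:00:00 -l nodes=4:ppn=8 myjob.sh
</pre>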
aa400a9a90b391262e6c74d1c882301fdcd54262 161 160 2010-11-17T13:08:03Z Mjh 2 wikitext text/x-wiki There are four possible job queues available on the system: * 'main' is the default queue: this submits to the 48 nodes of the main cluster. There are no other limitations on this queue. * 'cair_s' submits to the 32 CAIR nodes, with a maximum wall time of 6 hours. Users may have at most 2 jobs in this queue at a time. * 'cair_l' also submits to the CAIR nodes with no maximum wall time. This queue is restricted to CAIR users. * 'smp' is a special queue that submits to the two [[SMP machines]]. It is not currently possible to run MPI jobs that span the SMP machines and the main or CAIR clusters. == Default wall times == The default wall time limitation for the 'cair_s' queue is also the maximum, i.e. 6 hours. The default wall time for all the other queues is 24 hours. If you want a job to run for longer, you must explicitly provide a wall time estimate. 091a3fcb9383b60fffac6460ff84a8444f7abe4d 187 161 2011-03-10T15:19:22Z Mjh 2 wikitext text/x-wiki There are five possible job queues available on the system: * 'main' is the default queue: this submits to the 48 nodes of the main cluster. The maximum wall time on this queue is 1 week. * 'cair_s' submits to the 32 CAIR nodes, with a maximum wall time of 6 hours. Users may have at most 2 jobs in this queue at a time. * 'cair_l' also submits to the CAIR nodes with no maximum wall time. This queue is restricted to CAIR users. * 'smp' is a special queue that submits to the two [[SMP machines]]. It is not currently possible to run MPI jobs that span the SMP machines and the main or CAIR clusters. The maximum wall time for this queue is 48 hours. * 'sandbox' submits to the [[sandbox]] machines and is for use for development and testing. The maximum wall time for this queue is 1 hour. == Default wall times == The default wall time for the 'sandbox' queue is also the maximum, i.e. 1 hour. The default wall time limitation for the 'cair_s' queue is also the maximum, i.e. 6 hours. The default wall time for all the other queues is 24 hours. If you want a job to run for longer, you must explicitly provide a wall time estimate. 06abe04575ba7e6978c923a289acbac70bbc71cd SMP machines 0 24 159 142 2010-11-04T14:44:11Z Mjh 2 wikitext text/x-wiki The SMP machines are two 4-processor, 48-core systems each with 256 Gb of RAM. The individual cores in these machines are slightly slower (~ 20% on CPU-bound floating-point operations) than the Intel CPUs of the main computing nodes. So users are recommended to use the main part of the cluster for straightforward computing requirements that do not need the special features of the SMP machines. The big advantage of the SMP machines is the large amount of physical memory visible to all 48 cores. This allows for multi-threaded, shared-memory applications. The SMP machines also each have 10 Tb of local scratch disc space (mounted as /scratch). Jobs may be started on the SMP machines using the <tt>smp</tt> [[queues|queue]] in Torque. We will also permit direct login in certain circumstances. Please discuss your requirements with us. == Restrictions == The SMP machines are running FC13 and Linux 2.6.35-rc4 (!). This is slightly different from other nodes of the cluster; please be alert to problems this may cause. Infiniband-aware MPI code will not run on the SMP machines, as the libraries are not yet installed (the underlying OS is too new for the OFED packages to compile). 
Since it is probably not sensible to run jobs spanning the main cluster and the smp machines because of the difference in processing speeds, this should not be a serious restriction. 82a8eaa224c4eea457f4fb18547ba793fa508017 Jobs 0 9 169 144 2011-01-06T10:30:23Z Mjh 2 wikitext text/x-wiki The main mechanism for carrying out computing on the cluster is the '''batch queue system'''. This is based around [http://www.clusterresources.com/torquedocs21/ Torque] (formerly known as PBS) and [http://www.clusterresources.com/products/maui/docs/mauiadmin.shtml Maui], which are free software and widely used in the HPC world, particularly in academia. Rather than running processes directly on the compute nodes, you submit a script which, in general, contains both directives to the queuing system and the commands needed to run your jobs. The scheduler will then decide, based on the available resources, whether your job can be run immediately or must be deferred till later. If other people's jobs are running, you are likely to have to wait. It is important therefore to request only the minimum resources you need: a job that only needs 8 nodes may be able to run immediately while a job that requests all 48 nodes in the main cluster may need to wait for days. Once your job starts running the standard output and standard error can be saved or discarded, at your option. Most jobs will operate on files in one of the cluster-wide file systems (see [[storage]]). You can also arrange to have the system send you e-mail when your job starts and/or ends. You should enable [[passwordless ssh]] between nodes before starting to use the job submission system. Various parts of it and code that depends on it rely on being able to use ssh or scp. == Basic commands == The command to submit a job to the batch queue system is <tt>qsub</tt>. <tt>qsub</tt> takes one obligatory argument, the name of the script that will be run to start the job: <pre> qsub myjob.sh </pre> This must be a script: specifying a binary file will not work. Optionally, qsub can take a number of arguments that specify what resources the job requires and how it is to be handled by the queuing system. Alternatively, these can be specified in the script, tagged with lines beginning <tt>#PBS</tt> before any executable lines. For example, <pre> cat << END > myjob1.sh #!/bin/sh echo "Hello world" END qsub -N hello -m abe myjob1.sh </pre> and <pre> cat <<END > myjob2.sh #!/bin/sh #PBS -N hello #PBS -m abe echo "Hello world" END qsub myjob2.sh </pre> are functionally equivalent. It is often more useful to have the options specified in the script. You can also have a mixture: if the same option is specified on the <tt>qsub</tt> command line and in a script directive, the command line takes precedence. A full list of qsub options is given by <tt>man qsub</tt> or is available [http://www.clusterresources.com/torquedocs21/commands/qsub.shtml online]. Some important ones are as follows: * -N: specify the name of the job as it will appear in the queue * -m: say when the system should e-mail you: <tt>-m abe</tt> means to e-mail on abort, begin and end * -k: specify whether the standard output or standard error streams of the job should be preserved (by default in your home directory). * -j: specify whether the output and error streams should be kept separate or merged. * -q: specify what [[queues|queue]] the job will be run on. 
* -t: start multiple jobs simultaneously (see below) The most important option is <tt>-l</tt>, which specifies the resources to be used by the job. On our system the useful resources to specify are * <tt>walltime</tt>: the expected execution time in h:m:s format. * <tt>nodes</tt>: a list of requirements on the nodes to be allocated, or possibly more than one joined by a plus sign. For example, <pre> qsub -l walltime=1:0:0 -l nodes=16:ppn=8 job.sh ## run for one hour on 16 nodes with 8 CPUs per node qsub -l walltime=0:1:0 -l nodes=2:chassis1:ppn=4 job2.sh ## run for one minute on 2 nodes from chassis1 with 4 CPUs per node qsub -l walltime=0:5:0 -l nodes=1:chassis1:ppn=1+1:chassis2:ppn=1 job3.sh ## run for 5 minutes on one node from chassis1 and one from chassis2 qsub -l walltime=0:0:10 -l nodes=node001:ppn=8 job4.sh ## run for 10 seconds on all CPUs of node001 </pre> (In general you will not want to specify node or [[architecture|chassis]] names directly, as this will interfere with the efficiency of the scheduling. The exception might be if you want to guarantee that all inter-process communications take place within a chassis; see [[networking]]). Note also that in the present configuration requests for fewer processors per node than the 8 that are available may be consolidated onto fewer nodes. So the second request above would actually run on one physical node with 8 CPUs. This should never be a problem unless you are specifically testing inter-node communication. For efficiency, you are recommended always to ask for and use 8 processes per node for MPI or multi-threaded code. If you are running single-threaded code, you should ask for one processor per node. The defaults for all the other options are sensible but you ''must'' think carefully about the resources you need. Requesting more time or resources than you need will lead to inefficiency in scheduling your jobs: requesting less than you need (particularly in terms of walltime) is likely to mean that your job does not run to completion. A job that exceeds the walltime estimate will be terminated. 
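If you realise after submission that your walltime estimate was too small, it may be possible to adjust it with <tt>qalter</tt> while the job is still queued (a sketch -- the job id is just an example, and whether this is allowed for running jobs depends on how the scheduler is configured):
<pre>
qalter -l walltime=12:00:00 1766.stri-cluster
</pre>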
''A walltime should be specified: the default walltime (see [[Queues]]) is probably not what you want.'' Once a job is submitted, you can view its progress with <tt>qstat</tt>: <pre> >qstat Job id Name User Time Use S Queue ------------------- ---------------- --------------- -------- - ----- 1765.stri-cluster HCG62-doublebeta mjh 00:00:00 R all 1766.stri-cluster ...59-doublebeta mjh 00:00:00 R all 1767.stri-cluster ...07-doublebeta mjh 00:00:00 R all 1768.stri-cluster ...83-doublebeta mjh 00:00:00 R all 1769.stri-cluster ...73-doublebeta mjh 00:00:00 R all 1770.stri-cluster ...61-doublebeta mjh 0 Q all 1771.stri-cluster ...36-doublebeta mjh 0 Q all 1772.stri-cluster ...44-doublebeta mjh 0 Q all 1773.stri-cluster ...29-doublebeta mjh 0 Q all 1774.stri-cluster ...33-doublebeta mjh 0 Q all 1775.stri-cluster ...46-doublebeta mjh 0 Q all >qstat -a stri-cluster.herts.ac.uk: Req'd Req'd Elap Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time -------------------- -------- -------- ---------- ------ ----- --- ------ ----- - ----- 1765.stri-cluster.he mjh all HCG62-doub 11125 16 -- -- 05:00 R 01:56 1766.stri-cluster.he mjh all IC1459-dou 10658 16 -- -- 05:00 R 01:56 1767.stri-cluster.he mjh all NGC3607-do 7355 16 -- -- 05:00 R 01:56 1768.stri-cluster.he mjh all NGC383-dou 12069 16 -- -- 05:00 R 01:56 1769.stri-cluster.he mjh all NGC4073-do 12793 16 -- -- 05:00 R 01:56 1770.stri-cluster.he mjh all NGC4261-do -- 16 -- -- 05:00 Q -- 1771.stri-cluster.he mjh all NGC4636-do -- 16 -- -- 05:00 Q -- 1772.stri-cluster.he mjh all NGC5044-do -- 16 -- -- 05:00 Q -- 1773.stri-cluster.he mjh all NGC5129-do -- 16 -- -- 05:00 Q -- 1774.stri-cluster.he mjh all NGC533-dou -- 16 -- -- 05:00 Q -- 1775.stri-cluster.he mjh all NGC5846-do -- 16 -- -- 05:00 Q -- </pre> The different options to qstat give different views of the queue. Here we see that some jobs are running (R) and some are queued (Q). (Note that in the default <tt>qstat</tt> view the CPU time is not displayed correctly.) The MAUI tool <tt>showq</tt> can also be used to get a view of the whole queue: <pre> /usr/local/maui/bin/showq ACTIVE JOBS-------------------- JOBNAME USERNAME STATE PROC REMAINING STARTTIME 1765 mjh Running 128 1:15:20 Fri May 7 13:38:20 1766 mjh Running 128 1:15:20 Fri May 7 13:38:20 1767 mjh Running 128 1:15:20 Fri May 7 13:38:20 1768 mjh Running 128 1:15:20 Fri May 7 13:38:20 1769 mjh Running 128 1:15:20 Fri May 7 13:38:20 5 Active Jobs 640 of 640 Processors Active (100.00%) 80 of 80 Nodes Active (100.00%) IDLE JOBS---------------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME 1770 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1771 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1772 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1773 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1774 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1775 mjh Idle 128 5:00:00 Fri May 7 13:38:20 6 Idle Jobs BLOCKED JOBS---------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME Total Jobs: 11 Active Jobs: 5 Idle Jobs: 6 Blocked Jobs: 0 </pre> Finally, you can delete jobs from the queue with the Torque command <tt>qdel</tt>, you can change your resource request with <tt>qualter</tt>, and you can move queued jobs between different queues with <tt>qmove</tt>. == Running code == Your job will start running on ''one'' of the nodes it has been allocated. It is your script's responsibility to start your code running on any other nodes you have been allocated. At run time, environment variables contain information about the job. 
A full list is [http://www.clusterresources.com/torquedocs21/2.1jobsubmission.shtml#submitvars here]. The crucial one is $PBS_NODEFILE, which provides a list of nodes on which the job should run. ''You must honour this node list.'' The [http://www.clusterresources.com/torquedocs21/commands/pbsdsh.shtml pbsdsh] command can be used in your script to run a single program on all the allocated nodes, as in the following example: <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ pbsdsh hostname echo ------------------------------------------------------ echo Job ends </pre> pbsdsh would be suitable for starting a number of parallel processes which do not need to communicate, e.g. multiple identical Monte Carlo simulations. For scripts to start [[MPI]] processes in such a way that inter-process communication (IPC) works see the description of [[MPI]]. == Environment == In all cases your script must set the environment that you want your program to run in. Environment variables, aliases and working directory are not inherited from the shell that runs qsub. So you will often want to change to an appropriate working directory and to source some startup files in your qsub script before you actually execute any code. <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ cd /home/fred/my_working_directory export PATH=/home/fred/my_binaries:${PATH} /usr/local/bin/mpiexec my-mpi-code arg1 arg2 echo ------------------------------------------------------ echo Job ends </pre> == Multiple job submission == The <tt>-t</tt> option to qsub allows you to create multiple jobs simultaneously. These will all run the same script but will differ in the value of the environment variable <tt>$PBS_ARRAYID</tt>. Your script should normally use this variable to decide how to behave differently. For example, multiple runs of a Monte Carlo simulation might use it to store the output in different files. The jobs will all be submitted to the queue immediately and may run concurrently, depending on the resources they request, so it is important to make sure that it is safe to do this. A common mistake is to use the -t option wrongly. 
<pre> qsub -t 4 myjob.qsub </pre> will only start ''one'' job, with $PBS_ARRAYID set to 4. <pre> qsub -t 1-4 myjob.qsub </pre> will queue four jobs, with $PBS_ARRAYID ranging from 1 to 4. 1f23e1c506d4e9850bf88b9a6e7730dc14220552 172 169 2011-02-08T14:35:38Z Mjh 2 wikitext text/x-wiki The main mechanism for carrying out computing on the cluster is the '''batch queue system'''. This is based around [http://www.clusterresources.com/torquedocs21/ Torque] (formerly known as PBS) and [http://www.clusterresources.com/products/maui/docs/mauiadmin.shtml Maui], which are free software and widely used in the HPC world, particularly in academia. Rather than running processes directly on the compute nodes, you submit a script which, in general, contains both directives to the queuing system and the commands needed to run your jobs. The scheduler will then decide, based on the available resources, whether your job can be run immediately or must be deferred till later. If other people's jobs are running, you are likely to have to wait. It is important therefore to request only the minimum resources you need: a job that only needs 8 nodes may be able to run immediately while a job that requests all 48 nodes in the main cluster may need to wait for days. Once your job starts running the standard output and standard error can be saved or discarded, at your option. Most jobs will operate on files in one of the cluster-wide file systems (see [[storage]]). You can also arrange to have the system send you e-mail when your job starts and/or ends. You should enable [[passwordless ssh]] between nodes before starting to use the job submission system. Various parts of it and code that depends on it rely on being able to use ssh or scp. == Basic commands == The command to submit a job to the batch queue system is <tt>qsub</tt>. <tt>qsub</tt> takes one obligatory argument, the name of the script that will be run to start the job: <pre> qsub myjob.sh </pre> This must be a script: specifying a binary file will not work. Optionally, qsub can take a number of arguments that specify what resources the job requires and how it is to be handled by the queuing system. Alternatively, these can be specified in the script, tagged with lines beginning <tt>#PBS</tt> before any executable lines. For example, <pre> cat << END > myjob1.sh #!/bin/sh echo "Hello world" END qsub -N hello -m abe myjob1.sh </pre> and <pre> cat <<END > myjob2.sh #!/bin/sh #PBS -N hello #PBS -m abe echo "Hello world" END qsub myjob2.sh </pre> are functionally equivalent. It is often more useful to have the options specified in the script. You can also have a mixture: if the same option is specified on the <tt>qsub</tt> command line and in a script directive, the command line takes precedence. A full list of qsub options is given by <tt>man qsub</tt> or is available [http://www.clusterresources.com/torquedocs21/commands/qsub.shtml online]. Some important ones are as follows: * -N: specify the name of the job as it will appear in the queue * -m: say when the system should e-mail you: <tt>-m abe</tt> means to e-mail on abort, begin and end. Before setting this flag, read the page on [[Mail]] and take appropriate action! * -k: specify whether the standard output or standard error streams of the job should be preserved (by default in your home directory). * -j: specify whether the output and error streams should be kept separate or merged. * -q: specify what [[queues|queue]] the job will be run on. 
* -t: start multiple jobs simultaneously (see below) The most important option is <tt>-l</tt>, which specifies the resources to be used by the job. On our system the useful resources to specify are * <tt>walltime</tt>: the expected execution time in h:m:s format. * <tt>nodes</tt>: a list of requirements on the nodes to be allocated, or possibly more than one joined by a plus sign. For example, <pre> qsub -l walltime=1:0:0 -l nodes=16:ppn=8 job.sh ## run for one hour on 16 nodes with 8 CPUs per node qsub -l walltime=0:1:0 -l nodes=2:chassis1:ppn=4 job2.sh ## run for one minute on 2 nodes from chassis1 with 4 CPUs per node qsub -l walltime=0:5:0 -l nodes=1:chassis1:ppn=1+1:chassis2:ppn=1 job3.sh ## run for 5 minutes on one node from chassis1 and one from chassis2 qsub -l walltime=0:0:10 -l nodes=node001:ppn=8 job4.sh ## run for 10 seconds on all CPUs of node001 </pre> (In general you will not want to specify node or [[architecture|chassis]] names directly, as this will interfere with the efficiency of the scheduling. The exception might be if you want to guarantee that all inter-process communications take place within a chassis; see [[networking]]). Note also that in the present configuration requests for fewer processors per node than the 8 that are available may be consolidated onto fewer nodes. So the second request above would actually run on one physical node with 8 CPUs. This should never be a problem unless you are specifically testing inter-node communication. For efficiency, you are recommended always to ask for and use 8 processes per node for MPI or multi-threaded code. If you are running single-threaded code, you should ask for one processor per node. The defaults for all the other options are sensible but you ''must'' think carefully about the resources you need. Requesting more time or resources than you need will lead to inefficiency in scheduling your jobs: requesting less than you need (particularly in terms of walltime) is likely to mean that your job does not run to completion. A job that exceeds the walltime estimate will be terminated. 
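As a concrete sketch of a modest request along these lines (the executable name is a placeholder):
<pre>
#!/bin/sh
#PBS -N small-test
#PBS -l nodes=1:ppn=8
#PBS -l walltime=00:30:00
cd $PBS_O_WORKDIR
./my-threaded-code     # placeholder: something able to use all 8 cores
</pre>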
''A walltime should be specified: the default walltime (see [[Queues]]) is probably not what you want.'' Once a job is submitted, you can view its progress with <tt>qstat</tt>: <pre> >qstat Job id Name User Time Use S Queue ------------------- ---------------- --------------- -------- - ----- 1765.stri-cluster HCG62-doublebeta mjh 00:00:00 R all 1766.stri-cluster ...59-doublebeta mjh 00:00:00 R all 1767.stri-cluster ...07-doublebeta mjh 00:00:00 R all 1768.stri-cluster ...83-doublebeta mjh 00:00:00 R all 1769.stri-cluster ...73-doublebeta mjh 00:00:00 R all 1770.stri-cluster ...61-doublebeta mjh 0 Q all 1771.stri-cluster ...36-doublebeta mjh 0 Q all 1772.stri-cluster ...44-doublebeta mjh 0 Q all 1773.stri-cluster ...29-doublebeta mjh 0 Q all 1774.stri-cluster ...33-doublebeta mjh 0 Q all 1775.stri-cluster ...46-doublebeta mjh 0 Q all >qstat -a stri-cluster.herts.ac.uk: Req'd Req'd Elap Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time -------------------- -------- -------- ---------- ------ ----- --- ------ ----- - ----- 1765.stri-cluster.he mjh all HCG62-doub 11125 16 -- -- 05:00 R 01:56 1766.stri-cluster.he mjh all IC1459-dou 10658 16 -- -- 05:00 R 01:56 1767.stri-cluster.he mjh all NGC3607-do 7355 16 -- -- 05:00 R 01:56 1768.stri-cluster.he mjh all NGC383-dou 12069 16 -- -- 05:00 R 01:56 1769.stri-cluster.he mjh all NGC4073-do 12793 16 -- -- 05:00 R 01:56 1770.stri-cluster.he mjh all NGC4261-do -- 16 -- -- 05:00 Q -- 1771.stri-cluster.he mjh all NGC4636-do -- 16 -- -- 05:00 Q -- 1772.stri-cluster.he mjh all NGC5044-do -- 16 -- -- 05:00 Q -- 1773.stri-cluster.he mjh all NGC5129-do -- 16 -- -- 05:00 Q -- 1774.stri-cluster.he mjh all NGC533-dou -- 16 -- -- 05:00 Q -- 1775.stri-cluster.he mjh all NGC5846-do -- 16 -- -- 05:00 Q -- </pre> The different options to qstat give different views of the queue. Here we see that some jobs are running (R) and some are queued (Q). (Note that in the default <tt>qstat</tt> view the CPU time is not displayed correctly.) The MAUI tool <tt>showq</tt> can also be used to get a view of the whole queue: <pre> /usr/local/maui/bin/showq ACTIVE JOBS-------------------- JOBNAME USERNAME STATE PROC REMAINING STARTTIME 1765 mjh Running 128 1:15:20 Fri May 7 13:38:20 1766 mjh Running 128 1:15:20 Fri May 7 13:38:20 1767 mjh Running 128 1:15:20 Fri May 7 13:38:20 1768 mjh Running 128 1:15:20 Fri May 7 13:38:20 1769 mjh Running 128 1:15:20 Fri May 7 13:38:20 5 Active Jobs 640 of 640 Processors Active (100.00%) 80 of 80 Nodes Active (100.00%) IDLE JOBS---------------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME 1770 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1771 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1772 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1773 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1774 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1775 mjh Idle 128 5:00:00 Fri May 7 13:38:20 6 Idle Jobs BLOCKED JOBS---------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME Total Jobs: 11 Active Jobs: 5 Idle Jobs: 6 Blocked Jobs: 0 </pre> Finally, you can delete jobs from the queue with the Torque command <tt>qdel</tt>, you can change your resource request with <tt>qualter</tt>, and you can move queued jobs between different queues with <tt>qmove</tt>. == Running code == Your job will start running on ''one'' of the nodes it has been allocated. It is your script's responsibility to start your code running on any other nodes you have been allocated. At run time, environment variables contain information about the job. 
A full list is [http://www.clusterresources.com/torquedocs21/2.1jobsubmission.shtml#submitvars here]. The crucial one is $PBS_NODEFILE, which provides a list of nodes on which the job should run. ''You must honour this node list.'' The [http://www.clusterresources.com/torquedocs21/commands/pbsdsh.shtml pbsdsh] command can be used in your script to run a single program on all the allocated nodes, as in the following example: <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ pbsdsh hostname echo ------------------------------------------------------ echo Job ends </pre> pbsdsh would be suitable for starting a number of parallel processes which do not need to communicate, e.g. multiple identical Monte Carlo simulations. For scripts to start [[MPI]] processes in such a way that inter-process communication (IPC) works see the description of [[MPI]]. == Environment == In all cases your script must set the environment that you want your program to run in. Environment variables, aliases and working directory are not inherited from the shell that runs qsub. So you will often want to change to an appropriate working directory and to source some startup files in your qsub script before you actually execute any code. <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ cd /home/fred/my_working_directory export PATH=/home/fred/my_binaries:${PATH} /usr/local/bin/mpiexec my-mpi-code arg1 arg2 echo ------------------------------------------------------ echo Job ends </pre> == Multiple job submission == The <tt>-t</tt> option to qsub allows you to create multiple jobs simultaneously. These will all run the same script but will differ in the value of the environment variable <tt>$PBS_ARRAYID</tt>. Your script should normally use this variable to decide how to behave differently. For example, multiple runs of a Monte Carlo simulation might use it to store the output in different files. The jobs will all be submitted to the queue immediately and may run concurrently, depending on the resources they request, so it is important to make sure that it is safe to do this. A common mistake is to use the -t option wrongly. 
<pre> qsub -t 4 myjob.qsub </pre> will only start ''one'' job, with $PBS_ARRAYID set to 4. <pre> qsub -t 1-4 myjob.qsub </pre> will queue four jobs, with $PBS_ARRAYID ranging from 1 to 4. 25a1865f14f2c3a951d62ded4a52fded64e2f9ed 203 172 2011-03-28T18:43:40Z Mjh 2 /* Basic commands */ wikitext text/x-wiki The main mechanism for carrying out computing on the cluster is the '''batch queue system'''. This is based around [http://www.clusterresources.com/torquedocs21/ Torque] (formerly known as PBS) and [http://www.clusterresources.com/products/maui/docs/mauiadmin.shtml Maui], which are free software and widely used in the HPC world, particularly in academia. Rather than running processes directly on the compute nodes, you submit a script which, in general, contains both directives to the queuing system and the commands needed to run your jobs. The scheduler will then decide, based on the available resources, whether your job can be run immediately or must be deferred till later. If other people's jobs are running, you are likely to have to wait. It is important therefore to request only the minimum resources you need: a job that only needs 8 nodes may be able to run immediately while a job that requests all 48 nodes in the main cluster may need to wait for days. Once your job starts running the standard output and standard error can be saved or discarded, at your option. Most jobs will operate on files in one of the cluster-wide file systems (see [[storage]]). You can also arrange to have the system send you e-mail when your job starts and/or ends. You should enable [[passwordless ssh]] between nodes before starting to use the job submission system. Various parts of it and code that depends on it rely on being able to use ssh or scp. == Basic commands == The command to submit a job to the batch queue system is <tt>qsub</tt>. <tt>qsub</tt> takes one obligatory argument, the name of the script that will be run to start the job: <pre> qsub myjob.sh </pre> This must be a script: specifying a binary file will not work. Optionally, qsub can take a number of arguments that specify what resources the job requires and how it is to be handled by the queuing system. Alternatively, these can be specified in the script, tagged with lines beginning <tt>#PBS</tt> before any executable lines. For example, <pre> cat << END > myjob1.sh #!/bin/sh echo "Hello world" END qsub -N hello -m abe myjob1.sh </pre> and <pre> cat <<END > myjob2.sh #!/bin/sh #PBS -N hello #PBS -m abe echo "Hello world" END qsub myjob2.sh </pre> are functionally equivalent. It is often more useful to have the options specified in the script. You can also have a mixture: if the same option is specified on the <tt>qsub</tt> command line and in a script directive, the command line takes precedence. A full list of qsub options is given by <tt>man qsub</tt> or is available [http://www.clusterresources.com/torquedocs21/commands/qsub.shtml online]. Some important ones are as follows: * -N: specify the name of the job as it will appear in the queue * -m: say when the system should e-mail you: <tt>-m abe</tt> means to e-mail on abort, begin and end. Before setting this flag, read the page on [[Mail]] and take appropriate action! * -k: specify whether the standard output or standard error streams of the job should be preserved (by default in your home directory). * -j: specify whether the output and error streams should be kept separate or merged. * -q: specify what [[queues|queue]] the job will be run on. 
* -t: start multiple jobs simultaneously (see below) The most important option is <tt>-l</tt>, which specifies the resources to be used by the job. On our system the useful resources to specify are * <tt>walltime</tt>: the expected execution time in h:m:s format. * <tt>nodes</tt>: a list of requirements on the nodes to be allocated, or possibly more than one joined by a plus sign. For example, <pre> qsub -l walltime=2:0:0 -l nodes=40 job.sh ## run for two hours on 40 CPUs (which may be spread over up to 40 physical nodes) qsub -l walltime=1:0:0 -l nodes=16:ppn=8 job.sh ## run for one hour on 16 nodes with 8 CPUs per node qsub -l walltime=0:1:0 -l nodes=2:chassis1:ppn=4 job2.sh ## run for one minute on 2 nodes from chassis1 with 4 CPUs per node qsub -l walltime=0:5:0 -l nodes=1:chassis1:ppn=1+1:chassis2:ppn=1 job3.sh ## run for 5 minutes on one node from chassis1 and one from chassis2 qsub -l walltime=0:0:10 -l nodes=node001:ppn=8 job4.sh ## run for 10 seconds on all CPUs of node001 </pre> (In general you will not want to specify node or [[architecture|chassis]] names directly, as this will interfere with the efficiency of the scheduling. The exception might be if you want to guarantee that all inter-process communications take place within a chassis; see [[networking]]). Note also that in the present configuration requests for fewer processors per node than the 8 that are available may be consolidated onto fewer nodes. So the third request above, run on an empty cluster, would actually run on one physical node with 8 CPUs. This should never be a problem unless you are specifically testing inter-node communication. For efficiency, you are recommended always to ask for and use 8 processes per node for MPI or multi-threaded code. If you are running single-threaded code, you should ask for one processor per node. The defaults for all the other options are sensible but you ''must'' think carefully about the resources you need. Requesting more time or resources than you need will lead to inefficiency in scheduling your jobs: requesting less than you need (particularly in terms of walltime) is likely to mean that your job does not run to completion. A job that exceeds the walltime estimate will be terminated. 
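If you are unsure what limit the system has actually recorded for a submitted job, <tt>qstat -f</tt> prints the full set of job attributes, including the walltime it will enforce (the job id here is just an example):
<pre>
qstat -f 1765.stri-cluster | grep -i walltime
</pre>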
''A walltime should be specified: the default walltime (see [[Queues]]) is probably not what you want.'' Once a job is submitted, you can view its progress with <tt>qstat</tt>: <pre> >qstat Job id Name User Time Use S Queue ------------------- ---------------- --------------- -------- - ----- 1765.stri-cluster HCG62-doublebeta mjh 00:00:00 R all 1766.stri-cluster ...59-doublebeta mjh 00:00:00 R all 1767.stri-cluster ...07-doublebeta mjh 00:00:00 R all 1768.stri-cluster ...83-doublebeta mjh 00:00:00 R all 1769.stri-cluster ...73-doublebeta mjh 00:00:00 R all 1770.stri-cluster ...61-doublebeta mjh 0 Q all 1771.stri-cluster ...36-doublebeta mjh 0 Q all 1772.stri-cluster ...44-doublebeta mjh 0 Q all 1773.stri-cluster ...29-doublebeta mjh 0 Q all 1774.stri-cluster ...33-doublebeta mjh 0 Q all 1775.stri-cluster ...46-doublebeta mjh 0 Q all >qstat -a stri-cluster.herts.ac.uk: Req'd Req'd Elap Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time -------------------- -------- -------- ---------- ------ ----- --- ------ ----- - ----- 1765.stri-cluster.he mjh all HCG62-doub 11125 16 -- -- 05:00 R 01:56 1766.stri-cluster.he mjh all IC1459-dou 10658 16 -- -- 05:00 R 01:56 1767.stri-cluster.he mjh all NGC3607-do 7355 16 -- -- 05:00 R 01:56 1768.stri-cluster.he mjh all NGC383-dou 12069 16 -- -- 05:00 R 01:56 1769.stri-cluster.he mjh all NGC4073-do 12793 16 -- -- 05:00 R 01:56 1770.stri-cluster.he mjh all NGC4261-do -- 16 -- -- 05:00 Q -- 1771.stri-cluster.he mjh all NGC4636-do -- 16 -- -- 05:00 Q -- 1772.stri-cluster.he mjh all NGC5044-do -- 16 -- -- 05:00 Q -- 1773.stri-cluster.he mjh all NGC5129-do -- 16 -- -- 05:00 Q -- 1774.stri-cluster.he mjh all NGC533-dou -- 16 -- -- 05:00 Q -- 1775.stri-cluster.he mjh all NGC5846-do -- 16 -- -- 05:00 Q -- </pre> The different options to qstat give different views of the queue. Here we see that some jobs are running (R) and some are queued (Q). (Note that in the default <tt>qstat</tt> view the CPU time is not displayed correctly.) The MAUI tool <tt>showq</tt> can also be used to get a view of the whole queue: <pre> /usr/local/maui/bin/showq ACTIVE JOBS-------------------- JOBNAME USERNAME STATE PROC REMAINING STARTTIME 1765 mjh Running 128 1:15:20 Fri May 7 13:38:20 1766 mjh Running 128 1:15:20 Fri May 7 13:38:20 1767 mjh Running 128 1:15:20 Fri May 7 13:38:20 1768 mjh Running 128 1:15:20 Fri May 7 13:38:20 1769 mjh Running 128 1:15:20 Fri May 7 13:38:20 5 Active Jobs 640 of 640 Processors Active (100.00%) 80 of 80 Nodes Active (100.00%) IDLE JOBS---------------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME 1770 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1771 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1772 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1773 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1774 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1775 mjh Idle 128 5:00:00 Fri May 7 13:38:20 6 Idle Jobs BLOCKED JOBS---------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME Total Jobs: 11 Active Jobs: 5 Idle Jobs: 6 Blocked Jobs: 0 </pre> Finally, you can delete jobs from the queue with the Torque command <tt>qdel</tt>, you can change your resource request with <tt>qualter</tt>, and you can move queued jobs between different queues with <tt>qmove</tt>. == Running code == Your job will start running on ''one'' of the nodes it has been allocated. It is your script's responsibility to start your code running on any other nodes you have been allocated. At run time, environment variables contain information about the job. 
A full list is [http://www.clusterresources.com/torquedocs21/2.1jobsubmission.shtml#submitvars here]. The crucial one is $PBS_NODEFILE, which provides a list of nodes on which the job should run. ''You must honour this node list.'' The [http://www.clusterresources.com/torquedocs21/commands/pbsdsh.shtml pbsdsh] command can be used in your script to run a single program on all the allocated nodes, as in the following example: <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ pbsdsh hostname echo ------------------------------------------------------ echo Job ends </pre> pbsdsh would be suitable for starting a number of parallel processes which do not need to communicate, e.g. multiple identical Monte Carlo simulations. For scripts to start [[MPI]] processes in such a way that inter-process communication (IPC) works see the description of [[MPI]]. == Environment == In all cases your script must set the environment that you want your program to run in. Environment variables, aliases and working directory are not inherited from the shell that runs qsub. So you will often want to change to an appropriate working directory and to source some startup files in your qsub script before you actually execute any code. <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ cd /home/fred/my_working_directory export PATH=/home/fred/my_binaries:${PATH} /usr/local/bin/mpiexec my-mpi-code arg1 arg2 echo ------------------------------------------------------ echo Job ends </pre> == Multiple job submission == The <tt>-t</tt> option to qsub allows you to create multiple jobs simultaneously. These will all run the same script but will differ in the value of the environment variable <tt>$PBS_ARRAYID</tt>. Your script should normally use this variable to decide how to behave differently. For example, multiple runs of a Monte Carlo simulation might use it to store the output in different files. The jobs will all be submitted to the queue immediately and may run concurrently, depending on the resources they request, so it is important to make sure that it is safe to do this. A common mistake is to use the -t option wrongly. 
<pre> qsub -t 4 myjob.qsub </pre> will only start ''one'' job, with $PBS_ARRAYID set to 4. <pre> qsub -t 1-4 myjob.qsub </pre> will queue four jobs, with $PBS_ARRAYID ranging from 1 to 4. e329ead361b391e4c839da675b984b0c9c09e544 210 203 2011-04-22T21:20:35Z Mjh 2 wikitext text/x-wiki The main mechanism for carrying out computing on the cluster is the '''batch queue system'''. This is based around [http://www.clusterresources.com/torquedocs21/ Torque] (formerly known as PBS) and [http://www.clusterresources.com/products/maui/docs/mauiadmin.shtml Maui], which are free software and widely used in the HPC world, particularly in academia. Rather than running processes directly on the compute nodes, you submit a script which, in general, contains both directives to the queuing system and the commands needed to run your jobs. The scheduler will then decide, based on the available resources, whether your job can be run immediately or must be deferred till later. If other people's jobs are running, you are likely to have to wait. It is important therefore to request only the minimum resources you need: a job that only needs 8 nodes may be able to run immediately while a job that requests all 48 nodes in the main cluster may need to wait for days. Once your job starts running the standard output and standard error can be saved or discarded, at your option. Most jobs will operate on files in one of the cluster-wide file systems (see [[storage]]). You can also arrange to have the system send you e-mail when your job starts and/or ends. You should enable [[passwordless ssh]] between nodes before starting to use the job submission system. Various parts of it and code that depends on it rely on being able to use ssh or scp. == Basic commands == The command to submit a job to the batch queue system is <tt>qsub</tt>. <tt>qsub</tt> takes one obligatory argument, the name of the script that will be run to start the job: <pre> qsub myjob.sh </pre> This must be a script: specifying a binary file will not work. Optionally, qsub can take a number of arguments that specify what resources the job requires and how it is to be handled by the queuing system. Alternatively, these can be specified in the script, tagged with lines beginning <tt>#PBS</tt> before any executable lines. For example, <pre> cat << END > myjob1.sh #!/bin/sh echo "Hello world" END qsub -N hello -m abe myjob1.sh </pre> and <pre> cat <<END > myjob2.sh #!/bin/sh #PBS -N hello #PBS -m abe echo "Hello world" END qsub myjob2.sh </pre> are functionally equivalent. It is often more useful to have the options specified in the script. You can also have a mixture: if the same option is specified on the <tt>qsub</tt> command line and in a script directive, the command line takes precedence. A full list of qsub options is given by <tt>man qsub</tt> or is available [http://www.clusterresources.com/torquedocs21/commands/qsub.shtml online]. Some important ones are as follows: * -N: specify the name of the job as it will appear in the queue * -m: say when the system should e-mail you: <tt>-m abe</tt> means to e-mail on abort, begin and end. Before setting this flag, read the page on [[Mail]] and take appropriate action! * -k: specify whether the standard output or standard error streams of the job should be preserved (by default in your home directory). * -j: specify whether the output and error streams should be kept separate or merged. * -q: specify what [[queues|queue]] the job will be run on. 
* -t: start multiple jobs simultaneously (see below) The most important option is <tt>-l</tt>, which specifies the resources to be used by the job. On our system the useful resources to specify are * <tt>walltime</tt>: the expected execution time in h:m:s format. * <tt>nodes</tt>: a list of requirements on the nodes to be allocated, or possibly more than one joined by a plus sign. For example, <pre> qsub -l walltime=2:0:0 -l nodes=40 job.sh ## run for two hours on 40 CPUs (which may be spread over up to 40 physical nodes) qsub -l walltime=1:0:0 -l nodes=16:ppn=8 job.sh ## run for one hour on 16 nodes with 8 CPUs per node qsub -l walltime=0:1:0 -l nodes=2:chassis1:ppn=4 job2.sh ## run for one minute on 2 nodes from chassis1 with 4 CPUs per node qsub -l walltime=0:5:0 -l nodes=1:chassis1:ppn=1+1:chassis2:ppn=1 job3.sh ## run for 5 minutes on one node from chassis1 and one from chassis2 qsub -l walltime=0:0:10 -l nodes=node001:ppn=8 job4.sh ## run for 10 seconds on all CPUs of node001 </pre> (In general you will not want to specify node or [[architecture|chassis]] names directly, as this will interfere with the efficiency of the scheduling. The exception might be if you want to guarantee that all inter-process communications take place within a chassis; see [[networking]]). Note also that in the present configuration requests for fewer processors per node than the 8 that are available may be consolidated onto fewer nodes. So the third request above, run on an empty cluster, would actually run on one physical node with 8 CPUs. This should never be a problem unless you are specifically testing inter-node communication. For efficiency, you are recommended always to ask for and use 8 processes per node for MPI or multi-threaded code. If you are running single-threaded code, you should ask for one processor per node. The defaults for all the other options are sensible but you ''must'' think carefully about the resources you need. Requesting more time or resources than you need will lead to inefficiency in scheduling your jobs: requesting less than you need (particularly in terms of walltime) is likely to mean that your job does not run to completion. A job that exceeds the walltime estimate will be terminated. 
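Putting these pieces together, a complete submission script for a two-node MPI run might look something like the sketch below (the job name, script name and executable are invented for illustration; adjust the resource requests to what your code actually needs):
<pre>
#!/bin/sh
#PBS -N example-mpi
#PBS -l nodes=2:ppn=8
#PBS -l walltime=01:00:00
#PBS -j oe

cd $PBS_O_WORKDIR
/usr/local/bin/mpiexec ./my-mpi-code
</pre>
Submitted with <tt>qsub example-mpi.sh</tt>, this asks for two full nodes (16 processors) for at most one hour, with the output and error streams merged into a single file.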
''A walltime should be specified: the default walltime (see [[Queues]]) is probably not what you want.'' Once a job is submitted, you can view its progress with <tt>qstat</tt>: <pre> >qstat Job id Name User Time Use S Queue ------------------- ---------------- --------------- -------- - ----- 1765.stri-cluster HCG62-doublebeta mjh 00:00:00 R all 1766.stri-cluster ...59-doublebeta mjh 00:00:00 R all 1767.stri-cluster ...07-doublebeta mjh 00:00:00 R all 1768.stri-cluster ...83-doublebeta mjh 00:00:00 R all 1769.stri-cluster ...73-doublebeta mjh 00:00:00 R all 1770.stri-cluster ...61-doublebeta mjh 0 Q all 1771.stri-cluster ...36-doublebeta mjh 0 Q all 1772.stri-cluster ...44-doublebeta mjh 0 Q all 1773.stri-cluster ...29-doublebeta mjh 0 Q all 1774.stri-cluster ...33-doublebeta mjh 0 Q all 1775.stri-cluster ...46-doublebeta mjh 0 Q all >qstat -a stri-cluster.herts.ac.uk: Req'd Req'd Elap Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time -------------------- -------- -------- ---------- ------ ----- --- ------ ----- - ----- 1765.stri-cluster.he mjh all HCG62-doub 11125 16 -- -- 05:00 R 01:56 1766.stri-cluster.he mjh all IC1459-dou 10658 16 -- -- 05:00 R 01:56 1767.stri-cluster.he mjh all NGC3607-do 7355 16 -- -- 05:00 R 01:56 1768.stri-cluster.he mjh all NGC383-dou 12069 16 -- -- 05:00 R 01:56 1769.stri-cluster.he mjh all NGC4073-do 12793 16 -- -- 05:00 R 01:56 1770.stri-cluster.he mjh all NGC4261-do -- 16 -- -- 05:00 Q -- 1771.stri-cluster.he mjh all NGC4636-do -- 16 -- -- 05:00 Q -- 1772.stri-cluster.he mjh all NGC5044-do -- 16 -- -- 05:00 Q -- 1773.stri-cluster.he mjh all NGC5129-do -- 16 -- -- 05:00 Q -- 1774.stri-cluster.he mjh all NGC533-dou -- 16 -- -- 05:00 Q -- 1775.stri-cluster.he mjh all NGC5846-do -- 16 -- -- 05:00 Q -- </pre> The different options to qstat give different views of the queue. Here we see that some jobs are running (R) and some are queued (Q). (Note that in the default <tt>qstat</tt> view the CPU time is not displayed correctly.) The MAUI tool <tt>showq</tt> can also be used to get a view of the whole queue: <pre> /usr/local/maui/bin/showq ACTIVE JOBS-------------------- JOBNAME USERNAME STATE PROC REMAINING STARTTIME 1765 mjh Running 128 1:15:20 Fri May 7 13:38:20 1766 mjh Running 128 1:15:20 Fri May 7 13:38:20 1767 mjh Running 128 1:15:20 Fri May 7 13:38:20 1768 mjh Running 128 1:15:20 Fri May 7 13:38:20 1769 mjh Running 128 1:15:20 Fri May 7 13:38:20 5 Active Jobs 640 of 640 Processors Active (100.00%) 80 of 80 Nodes Active (100.00%) IDLE JOBS---------------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME 1770 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1771 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1772 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1773 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1774 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1775 mjh Idle 128 5:00:00 Fri May 7 13:38:20 6 Idle Jobs BLOCKED JOBS---------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME Total Jobs: 11 Active Jobs: 5 Idle Jobs: 6 Blocked Jobs: 0 </pre> Finally, you can delete jobs from the queue with the Torque command <tt>qdel</tt>, you can change your resource request with <tt>qualter</tt>, and you can move queued jobs between different queues with <tt>qmove</tt>. == Running code == Your job will start running on ''one'' of the nodes it has been allocated. It is your script's responsibility to start your code running on any other nodes you have been allocated. At run time, environment variables contain information about the job. 
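For example, a quick way to see them from inside a running job (or an [[interactive jobs|interactive job]]) is:
<pre>
printenv | grep '^PBS'
</pre>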
A full list is [http://www.clusterresources.com/torquedocs21/2.1jobsubmission.shtml#submitvars here]. The crucial one is $PBS_NODEFILE, which provides a list of nodes on which the job should run. ''You must honour this node list.'' The [http://www.clusterresources.com/torquedocs21/commands/pbsdsh.shtml pbsdsh] command can be used in your script to run a single program on all the allocated nodes, as in the following example: <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ pbsdsh hostname echo ------------------------------------------------------ echo Job ends </pre> pbsdsh would be suitable for starting a number of parallel processes which do not need to communicate, e.g. multiple identical Monte Carlo simulations. For scripts to start [[MPI]] processes in such a way that inter-process communication (IPC) works see the description of [[MPI]]. == Environment == In all cases your script must set the environment that you want your program to run in. Environment variables, aliases and working directory are not inherited from the shell that runs qsub. So you will often want to change to an appropriate working directory and to source some startup files in your qsub script before you actually execute any code. <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ cd /home/fred/my_working_directory export PATH=/home/fred/my_binaries:${PATH} /usr/local/bin/mpiexec my-mpi-code arg1 arg2 echo ------------------------------------------------------ echo Job ends </pre> == Multiple job submission == The <tt>-t</tt> option to qsub allows you to create multiple jobs simultaneously. These will all run the same script but will differ in the value of the environment variable <tt>$PBS_ARRAYID</tt>. Your script should normally use this variable to decide how to behave differently. For example, multiple runs of a Monte Carlo simulation might use it to store the output in different files. The jobs will all be submitted to the queue immediately and may run concurrently, depending on the resources they request, so it is important to make sure that it is safe to do this. A common mistake is to use the -t option wrongly. 
<pre> qsub -t 4 myjob.qsub </pre> will only start ''one'' job, with $PBS_ARRAYID set to 4. <pre> qsub -t 1-4 myjob.qsub </pre> will queue four jobs, with $PBS_ARRAYID ranging from 1 to 4. == Interactive jobs == If you need to access nodes interactively, see the separate page on [[interactive jobs]]. babeb6cb3182406bd4e7d00bf3e2dfd9f6a8d751 Software 0 17 173 114 2011-02-09T15:02:43Z Mjh 2 wikitext text/x-wiki This page documents locations of software. Detailed local documentation of software should go in a page specific to that software. * <u>[[Gromacs]]</u>: 4.0.7 installed in <tt>/soft/gromacs</tt> * Autodock <u>[[Vina]]</u>: 1.1.1 installed in <tt>/soft/autodock_vina_1_1_1_linux_x86</tt> * <u>[[Autodock]]</u>: 4.2 installed in <tt>/soft/autodock</tt> * <u>[[iGemDock]]</u>: 2.1 installed in <tt>/soft/iGEMDOCKv2.1-centos/</tt> * <u>[[AIPS]]</u>: 31DEC10 installed in /soft/aips * <us>[[CASA]]</u>: 3.1.0 installed in /soft/casapy-31.0.13530-002-64b 73bb54a4cb6985265678a426dba2f3f030aac0dd 177 173 2011-02-09T15:14:45Z Mjh 2 wikitext text/x-wiki This page documents locations of software. Detailed local documentation of software should go in a page specific to that software. * <u>[[Gromacs]]</u>: 4.0.7 installed in <tt>/soft/gromacs</tt> * Autodock <u>[[Vina]]</u>: 1.1.1 installed in <tt>/soft/autodock_vina_1_1_1_linux_x86</tt> * <u>[[Autodock]]</u>: 4.2 installed in <tt>/soft/autodock</tt> * <u>[[iGemDock]]</u>: 2.1 installed in <tt>/soft/iGEMDOCKv2.1-centos/</tt> * <u>[[AIPS]]</u>: 31DEC10 installed in /soft/aips * <u>[[CASA]]</u>: 3.1.0 installed in /soft/casapy-31.0.13530-002-64b 4f3ef4c8ce8112fdac5324f8ad68784dbcb26de9 197 177 2011-03-23T15:53:13Z Mjh 2 wikitext text/x-wiki This page documents locations of software. Detailed local documentation of software should go in a page specific to that software. * <u>[[Gromacs]]</u>: 4.0.7 installed in <tt>/soft/gromacs</tt> * Autodock <u>[[Vina]]</u>: 1.1.1 installed in <tt>/soft/autodock_vina_1_1_1_linux_x86</tt> * <u>[[Autodock]]</u>: 4.2 installed in <tt>/soft/autodock</tt> * <u>[[iGemDock]]</u>: 2.1 installed in <tt>/soft/iGEMDOCKv2.1-centos/</tt> * <u>[[AIPS]]</u>: 31DEC10 installed in /soft/aips * <u>[[CASA]]</u>: 3.1.0 installed in /soft/casapy-31.0.13530-002-64b * <u>[[IDL]]</u>: in /soft/idl/idl/bin 16000f4c2480e51e91e71a7d87d8ecb9df1933af AIPS 0 27 174 2011-02-09T15:09:06Z Mjh 2 Created page with 'AIPS is software for radio astronomy data reduction. It is installed on the cluster in /soft/aips . To use aips you will need to be in the aipsuser group. From the head node, l…' wikitext text/x-wiki AIPS is software for radio astronomy data reduction. It is installed on the cluster in /soft/aips . To use aips you will need to be in the aipsuser group. From the head node, log in to the machine you have been instructed to use -- aips cannot be used through the batch job system -- using e.g. <tt>ssh -X smp1</tt>. Then do <tt>/soft/aips/START_AIPS tv=local</tt>. Disc 1 will be a local disc. Optionally, do <tt>soft/aips/START_AIPS tv=local da=stri-cluster</tt> to get access to the cluster data area -- but you are recommended not to try to use this for data reduction. 870b53dfea61ca729328e780487e8f287064d22e 175 174 2011-02-09T15:09:28Z Mjh 2 wikitext text/x-wiki AIPS is software for radio astronomy data reduction. It is installed on the cluster in /soft/aips . To use aips you will need to be in the aipsuser group. 
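If you are not sure whether your account is already in that group, a quick check from the command line is:
<pre>
groups | grep aipsuser
</pre>
(no output means you are not in the group and should contact the [[Administrators]]).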
From the head node, log in to the machine you have been instructed to use -- aips cannot be used through the batch job system -- using e.g. <tt>ssh -X smp1</tt>. Then do <tt>/soft/aips/START_AIPS tv=local</tt>. Disc 1 will be a local disc. Optionally, do <tt>/soft/aips/START_AIPS tv=local da=stri-cluster</tt> to get access to the cluster data area -- but you are recommended not to try to use this for data reduction. 08296865773817205b810e728b6fcb84cc9750ed CASA 0 28 176 2011-02-09T15:13:33Z Mjh 2 Created page with 'CASA is software for radio astronomy data reduction. It is installed on the cluster at /soft/casapy-31.0.13530-002-64b/ . To use casa, do <tt>setenv PATH /soft/casapy-31.0.13530…' wikitext text/x-wiki CASA is software for radio astronomy data reduction. It is installed on the cluster at /soft/casapy-31.0.13530-002-64b/ . To use casa, do <tt>setenv PATH /soft/casapy-31.0.13530-002-64b:$PATH</tt> and then run it with <tt>casapy</tt>. You should not run CASA on the head node: either run it through the batch job system or log into a node that you have been assigned for interactive use. 85931352a08f3710286a57981d1c1b9e7db38a64 Acknowledgements 0 29 179 2011-02-18T09:07:14Z Mjh 2 Created page with 'If possible, please say explicitly that you have used the STRI cluster in any paper you publish that makes use of results obtained using it. We don't insist on any explicit form…' wikitext text/x-wiki If possible, please say explicitly that you have used the STRI cluster in any paper you publish that makes use of results obtained using it. We don't insist on any explicit form of words, but an example might be 'This work has made use of the University of Hertfordshire Science and Technology Research Institute high-performance computing facility.' The cluster doesn't really have an outward-facing web presence, but [http://star.herts.ac.uk/progs/computing.html] might be of some use to some people. Please also add details of any submitted, accepted or published paper using the cluster to the [[Bibliography]] page. 8b02997bce791b69736c8a4cade949980007c01e 180 179 2011-02-18T09:07:40Z Mjh 2 wikitext text/x-wiki If possible, please say explicitly that you have used the STRI cluster in any paper you publish that makes use of results obtained using it. We don't insist on any explicit form of words, but an example might be 'This work has made use of the University of Hertfordshire Science and Technology Research Institute high-performance computing facility.' The cluster doesn't really have an outward-facing web presence, but [http://star.herts.ac.uk/progs/computing.html] might be of some use to some people. Please also add details of any submitted, accepted or published paper using the cluster to the [[Cluster bibliography]] page. 1c32780b985605c824693f4a3a30a6e17cd27a7d Cluster bibliography 0 30 181 2011-02-18T09:09:21Z Mjh 2 Created page with 'Please add details of any papers written using the cluster here. This allows us to keep track of the productivity of the cluster. * Hardcastle MJ, Croston JH, Modelling TeV gamm…' wikitext text/x-wiki Please add details of any papers written using the cluster here. This allows us to keep track of the productivity of the cluster. * Hardcastle MJ, Croston JH, Modelling TeV gamma-ray emission from the kiloparsec-scale jets of Centaurus A and M87, submitted to MNRAS, 2011 Feb e5b83df9e8f495e07df875bfe03f47eba14db8b8 182 181 2011-02-18T11:31:56Z Mjh 2 wikitext text/x-wiki Please add details of any papers written using the cluster here. 
This allows us to keep track of the productivity of the cluster. You can use any format you like so long as you identify the authors, the title (this helps us keep track of what different areas of research are being carried out with the cluster), and an indication of the bibliographic details and date. * Hardcastle MJ, Croston JH, Modelling TeV gamma-ray emission from the kiloparsec-scale jets of Centaurus A and M87, submitted to MNRAS, 2011 Feb b27d0993585254041b56190c96d900aefee67b67 183 182 2011-02-18T12:37:06Z Gsousa 6 wikitext text/x-wiki Please add details of any papers written using the cluster here. This allows us to keep track of the productivity of the cluster. You can use any format you like so long as you identify the authors, the title (this helps us keep track of what different areas of research are being carried out with the cluster), and an indication of the bibliographic details and date. * Hardcastle MJ, Croston JH, Modelling TeV gamma-ray emission from the kiloparsec-scale jets of Centaurus A and M87, submitted to MNRAS, 2011 Feb * de Sousa G, Maex R, Adams R, Davey N, Steuber V, Optimization of neuronal morphologies for pattern recognition, published at BMC Neuroscience 2010 dd4d54d2c542325113c4b5e59d04ceb72d00185e 184 183 2011-02-18T12:41:52Z Karen 7 wikitext text/x-wiki Please add details of any papers written using the cluster here. This allows us to keep track of the productivity of the cluster. You can use any format you like so long as you identify the authors, the title (this helps us keep track of what different areas of research are being carried out with the cluster), and an indication of the bibliographic details and date. * Hardcastle MJ, Croston JH, Modelling TeV gamma-ray emission from the kiloparsec-scale jets of Centaurus A and M87, submitted to MNRAS, 2011 Feb * de Sousa G, Maex R, Adams R, Davey N, Steuber V, Optimization of neuronal morphologies for pattern recognition, published at BMC Neuroscience 2010 * Karen Safaryan, Reinoud Maex, Rod Adams, Neil Davey, and Volker Steuber The Effect of Non-Specific LTD on Pattern Recognition in Cerebellar Purkinje Cells, published at BMC Neuroscience 11, P92 553945b76108dcab3600f17e7c7ea4eb52d22bd4 190 184 2011-03-16T20:30:35Z Mjh 2 wikitext text/x-wiki Please add details of any papers written using the cluster here. This allows us to keep track of the productivity of the cluster. You can use any format you like so long as you identify the authors, the title (this helps us keep track of what different areas of research are being carried out with the cluster), and an indication of the bibliographic details and date. * Hardcastle MJ, Croston JH, Modelling TeV gamma-ray emission from the kiloparsec-scale jets of Centaurus A and M87, accepted by MNRAS, 2011 Mar * de Sousa G, Maex R, Adams R, Davey N, Steuber V, Optimization of neuronal morphologies for pattern recognition, published at BMC Neuroscience 2010 * Karen Safaryan, Reinoud Maex, Rod Adams, Neil Davey, and Volker Steuber The Effect of Non-Specific LTD on Pattern Recognition in Cerebellar Purkinje Cells, published at BMC Neuroscience 11, P92 891cac386a82e7e2441751ec933b1eb81071f8db Web server 0 32 189 2011-03-16T20:29:27Z Mjh 2 Created page with 'If you create a directory <tt>public_html</tt> in your home directory, then its contents are visible at <tt>http://stri-cluster.herts.ac.uk/~your-username/</tt>. 
Like all other s…' wikitext text/x-wiki If you create a directory <tt>public_html</tt> in your home directory, then its contents are visible at <tt>http://stri-cluster.herts.ac.uk/~your-username/</tt>. Like all other stri-cluster web pages, this is not visible outside the University, but you may use this facility to export data etc within the university. c948dde882b27f8f96b6bf6cb406be0bc6eb70f5 Modules 0 33 192 2011-03-21T21:12:40Z Mjh 2 Created page with 'The cluster uses the <tt>environment-modules</tt> package to manage environment variable settings for software packages. When a 'module' is loaded, the appropriate environment se…' wikitext text/x-wiki The cluster uses the <tt>environment-modules</tt> package to manage environment variable settings for software packages. When a 'module' is loaded, the appropriate environment settings are added to the start of your PATH, MANPATH, LD_LIBRARY_PATH etc, and other variables used by the relevant package may be set as well. When a module is unloaded, the changes made to environment variables are undone. Documentation of this package is available at this link[http://modules.sourceforge.net/], or type <tt>man module</tt> Basic commands include: * <tt>module list</tt>. See what modules you have loaded. * <tt>module avail</tt>. List what modules are available to you. * <tt>module load [modulename]</tt>. Load a module, e.g. <tt>module load mpich2-local</tt> * <tt>module unload [modulename]</tt>. Unload a module. Modules currently available are: * <tt>mpich2-x86_64</tt>: Fedora standard implementation of MPICH2. Loaded by default. * <tt>mpich2-local</tt>: Local version of mpich2. Will generally be more up-to-date than <tt>mpich2-x86_64</tt>. * <tt>mpich2-intel</tt>: A version of mpich2 compiled with the Intel compiler. * <tt>mvapich2></tt>: The MVAPICH2 implementation of [[MPI]]. * <tt>OpenMPI></tt>: The OpenMPI implementation of [[MPI]]. You may use <tt>module</tt> commands in your .bashrc or .cshrc. For example, I have <pre> module unload mpich2-x86_64 module load mpich2-local </pre> as the first two lines of my .cshrc. We are happy to add other environments as modules -- please contact the cluster [[Administrators]]. 79c51d1143103a668739a60dcb417fdced66b7c6 193 192 2011-03-21T21:26:16Z Mjh 2 wikitext text/x-wiki The cluster uses the <tt>environment-modules</tt> package to manage environment variable settings for software packages. When a 'module' is loaded, the appropriate environment settings are added to the start of your PATH, MANPATH, LD_LIBRARY_PATH etc, and other variables used by the relevant package may be set as well. When a module is unloaded, the changes made to environment variables are undone. Documentation of this package is available at this link[http://modules.sourceforge.net/], or type <tt>man module</tt> Basic commands include: * <tt>module list</tt>. See what modules you have loaded. * <tt>module avail</tt>. List what modules are available to you. * <tt>module load [modulename]</tt>. Load a module, e.g. <tt>module load mpich2-local</tt> * <tt>module unload [modulename]</tt>. Unload a module. Modules currently available are: * <tt>mpich2-x86_64</tt>: Fedora standard implementation of MPICH2. Loaded by default. * <tt>mpich2-local</tt>: Local version of mpich2. Will generally be more up-to-date than <tt>mpich2-x86_64</tt>. * <tt>mpich2-intel</tt>: A version of mpich2 compiled with the Intel compiler. * <tt>mvapich2</tt>: The MVAPICH2 implementation of [[MPI]]. * <tt>OpenMPI</tt>: The OpenMPI implementation of [[MPI]]. 
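Because batch jobs do not inherit your login environment (see [[jobs]]), it can be useful to load the module you need inside the job script itself. A minimal sketch, assuming the <tt>module</tt> command is available to non-interactive shells on the compute nodes:
<pre>
#!/bin/sh
#PBS -N module-example
#PBS -l nodes=1:ppn=1
#PBS -l walltime=00:05:00

module unload mpich2-x86_64
module load mpich2-local
which mpiexec    ## confirm the intended MPI build is first in the PATH
</pre>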
You may use <tt>module</tt> commands in your .bashrc or .cshrc. For example, I have <pre> module unload mpich2-x86_64 module load mpich2-local </pre> as the first two lines of my .cshrc. We are happy to add other environments as modules -- please contact the cluster [[Administrators]]. d9542746c51cb5ab8fa71561d7050ca8cafe8de4 Interactive jobs 0 35 208 2011-04-22T21:18:59Z Mjh 2 Created page with 'Interactive jobs are jobs which give you an interactive session on one of the compute nodes. Importantly, accessing the compute nodes this way means that the job control system g…' wikitext text/x-wiki Interactive jobs are jobs which give you an interactive session on one of the compute nodes. Importantly, accessing the compute nodes this way means that the job control system guarantees the resources that you have asked for. If you simply log into a compute node with ssh (which is in any case strongly discouraged by the cluster access [[policies]]), another user's job may compete with what you are trying to do. Therefore you should, if possible, always use the interactive job facility to run interactively on the compute nodes. == Running an interactive job == An interactive job is run using the <tt>qsub</tt> command as for normal [[jobs]]. However, the result of running an interactive job is that you are logged on to the target machine. For example, <pre> [user@stri-cluster ~]$ qsub -l walltime=00:30:00 -I -q main qsub: waiting for job 123456.stri-cluster.herts.ac.uk to start qsub: job 123456.stri-cluster.herts.ac.uk ready [user@node047 ~]$ </pre> In this example the user has requested a 30-minute session on any node in the main cluster (the default request is one processor on one node). A node is available to run the job and so the user finds herself at a shell prompt. She may now use the node interactively for 30 minutes. At the end of the 30 minutes (wall time, not CPU time!) the job will end and the session will be terminated. If the user logs out before then, the job will terminate early. Note that, just as with ordinary jobs, you must honour the allocation you are given. If you ask for one CPU on one node, you must only use that. A number of environment variables beginning with PBS (try <tt>printenv | grep PBS</tt> are provided to tell you the job environment in which you are running in case you have forgotten. If your request for an interactive shell cannot be fulfilled, the qsub command will wait until it can. == Advanced topics == === Multiple CPUs === If you know you want a machine completely dedicated to you (e.g. because you plan to run multithreaded code interactively) then you must explicitly request that: e.g., <pre> qsub -l walltime=24:00:00 -l nodes=1:ppn=48 -I -q smp </pre> will reserve all 48 cores of one of the [[SMP machines]] for you for a day. === Multiple nodes === In the unlikely situation in which you want to run interactively on several nodes at once, you will only be given one shell. You can use the <tt>pbsdsh</tt> command to run the same commands on each of the processors you have been allocated, or /usr/local/bin/mpiexec to run [[MPI]] jobs. <pre> qsub -l walltime=00:01:00 -l nodes=2:ppn=2 -I -q smp qsub: waiting for job 123456.stri-cluster.herts.ac.uk to start qsub: job 123456.stri-cluster.herts.ac.uk ready [user@smp2 ~]$ pbsdsh hostname smp2 smp1 smp1 smp2 </pre> === X forwarding === If you want to run jobs that use the X windows system, you may turn on X11 forwarding using the <tt>-X</tt> option. (This is like the <tt>-X</tt> option to <tt>ssh</tt>. 
=== Walltime requests === Please be considerate in your walltime requests. While you have an interactive job running, no other user can use the resources you have allocated. Therefore, you should not set an interactive job running for a week and walk away. Jobs that appear to be reserving resources without using them will be terminated. Obviously the best strategy is to select a walltime that roughly matches what you want and exit before the time is up. 234d72f2f705a765b736a5ed2f98f9b14c87def3 209 208 2011-04-22T21:19:24Z Mjh 2 wikitext text/x-wiki Interactive jobs are jobs which give you an interactive session on one of the compute nodes. Importantly, accessing the compute nodes this way means that the job control system guarantees the resources that you have asked for. If you simply log into a compute node with ssh (which is in any case strongly discouraged by the cluster access [[policies]]), another user's job may compete with what you are trying to do. Therefore you should, if possible, always use the interactive job facility to run interactively on the compute nodes. == Running an interactive job == An interactive job is run using the <tt>qsub</tt> command as for normal [[jobs]]. However, the result of running an interactive job is that you are logged on to the target machine. For example, <pre> [user@stri-cluster ~]$ qsub -l walltime=00:30:00 -I -q main qsub: waiting for job 123456.stri-cluster.herts.ac.uk to start qsub: job 123456.stri-cluster.herts.ac.uk ready [user@node047 ~]$ </pre> In this example the user has requested a 30-minute session on any node in the main cluster (the default request is one processor on one node). A node is available to run the job and so the user finds herself at a shell prompt. She may now use the node interactively for 30 minutes. At the end of the 30 minutes (wall time, not CPU time!) the job will end and the session will be terminated. If the user logs out before then, the job will terminate early. Note that, just as with ordinary jobs, you must honour the allocation you are given. If you ask for one CPU on one node, you must only use that. A number of environment variables beginning with PBS (try <tt>printenv | grep PBS</tt> are provided to tell you the job environment in which you are running in case you have forgotten. If your request for an interactive shell cannot be fulfilled, the qsub command will wait until it can. == Advanced topics == === Multiple CPUs === If you know you want a machine completely dedicated to you (e.g. because you plan to run multithreaded code interactively) then you must explicitly request that: e.g., <pre> qsub -l walltime=24:00:00 -l nodes=1:ppn=48 -I -q smp </pre> will reserve all 48 cores of one of the [[SMP machines]] for you for a day. === Multiple nodes === In the unlikely situation in which you want to run interactively on several nodes at once, you will only be given one shell. You can use the <tt>pbsdsh</tt> command to run the same commands on each of the processors you have been allocated, or /usr/local/bin/mpiexec to run [[MPI]] jobs. <pre> qsub -l walltime=00:01:00 -l nodes=2:ppn=2 -I -q smp qsub: waiting for job 123456.stri-cluster.herts.ac.uk to start qsub: job 123456.stri-cluster.herts.ac.uk ready [user@smp2 ~]$ pbsdsh hostname smp2 smp1 smp1 smp2 </pre> === X forwarding === If you want to run jobs that use the X windows system, you may turn on X11 forwarding using the <tt>-X</tt> option. (This is like the <tt>-X</tt> option to <tt>ssh</tt>.) 
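For example, a one-hour interactive session on a single processor in the main queue, with X forwarding enabled, could be requested with:
<pre>
qsub -I -X -l walltime=01:00:00 -l nodes=1:ppn=1 -q main
</pre>
Graphical programs started in the resulting shell should then display back on your desktop, provided you logged in to the head node with X forwarding enabled (e.g. <tt>ssh -X</tt>).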
=== Walltime requests === Please be considerate in your walltime requests. While you have an interactive job running, no other user can use the resources you have allocated. Therefore, you should not set an interactive job running for a week and walk away. Jobs that appear to be reserving resources without using them will be terminated. Obviously the best strategy is to select a walltime that roughly matches what you want and exit before the time is up. 033b32d8a8078fd1e0db4990ca9631372bedbfe4 Parallelization 0 14 213 46 2011-04-23T07:42:41Z Mjh 2 wikitext text/x-wiki It is very important to realise that running a job on the cluster does not automatically give you access to all the CPU resources of the cluster (or even of the subset of the cluster you request from the [[jobs|job control system]]). There is one, and only one, situation in which you can run normal, single-threaded code on the cluster and get an improvement in performance. That is the situation in which ''you can usefully run multiple copies of the same program simultaneously without any communication between the different copies''; in other words, you are only limited by how many CPUs you can get access to. Examples might include Monte Carlo simulation or some data analysis tasks. Problems of this kind are called ''embarrassingly parallel''. Provided that your code is thread-safe &mdash; that is, it doesn't have implementation errors that prevent it being run several times simultaneously, such as use of temporary files with the same name &mdash; you can use the cluster for this sort of problem without modifying your code. You may be able to use the [[job control system|jobs]] with commands such as <tt>pbsdsh</tt>, or you may need to run an [[interactive jobs|interactive job]]. Once you need the different parts of your code to communicate, you in general ''must'' use code that is written specifically to do so. At present [[MPI]] is the method provided to do this. If you are planning to run an application written by a third party, it needs to use [[MPI]] if you expect to be able to run on more than one node. If you are running your own code, you will need to modify it, often very substantially, to allow parallel execution. '''This is your responsibility, not that of the cluster administrators.''' 961028ae7e5bf4b8c451f117b5317b658293a40b Parallelization 0 14 214 213 2011-04-23T07:43:09Z Mjh 2 wikitext text/x-wiki It is very important to realise that running a job on the cluster does not automatically give you access to all the CPU resources of the cluster (or even of the subset of the cluster you request from the [[jobs|job control system]]). There is one, and only one, situation in which you can run normal, single-threaded code on the cluster and get an improvement in performance. That is the situation in which ''you can usefully run multiple copies of the same program simultaneously without any communication between the different copies''; in other words, you are only limited by how many CPUs you can get access to. Examples might include Monte Carlo simulation or some data analysis tasks. Problems of this kind are called ''embarrassingly parallel''. Provided that your code is thread-safe &mdash; that is, it doesn't have implementation errors that prevent it being run several times simultaneously, such as use of temporary files with the same name &mdash; you can use the cluster for this sort of problem without modifying your code. 
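As a concrete illustration of the temporary-file problem: if every copy of a program writes its scratch data to the same fixed path, simultaneous copies will silently overwrite each other. A common workaround is to build the job ID (and, if several copies share a node, the process ID) into the filename. A sketch, where the program name and its option are invented for illustration:
<pre>
## unsafe: all copies write to the same file
./my-simulation --output /tmp/results.dat

## safer: each copy writes to its own file
./my-simulation --output /tmp/results.$PBS_JOBID.$$.dat
</pre>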
You may be able to use the [[jobs|job control system]] with commands such as <tt>pbsdsh</tt>, or you may need to run an [[interactive jobs|interactive job]]. Once you need the different parts of your code to communicate, you in general ''must'' use code that is written specifically to do so. At present [[MPI]] is the method provided to do this. If you are planning to run an application written by a third party, it needs to use [[MPI]] if you expect to be able to run on more than one node. If you are running your own code, you will need to modify it, often very substantially, to allow parallel execution. '''This is your responsibility, not that of the cluster administrators.''' 08f0125bc8db50e23b1c4e16b7188ddb9db366ca Interactive jobs 0 35 215 209 2011-04-23T07:48:09Z Mjh 2 wikitext text/x-wiki Interactive jobs are jobs which give you an interactive session on one of the compute nodes. Importantly, accessing the compute nodes this way means that the job control system guarantees the resources that you have asked for. If you simply log into a compute node with ssh (which is in any case strongly discouraged by the cluster access [[policies]]), another user's job may compete with what you are trying to do. Therefore you should, if possible, always use the interactive job facility to run interactively on the compute nodes. == Running an interactive job == An interactive job is run using the <tt>qsub</tt> command as for normal [[jobs]]. However, the result of running an interactive job is that you are logged on to the target machine. For example, <pre> [user@stri-cluster ~]$ qsub -l walltime=00:30:00 -I -q main qsub: waiting for job 123456.stri-cluster.herts.ac.uk to start qsub: job 123456.stri-cluster.herts.ac.uk ready [user@node047 ~]$ </pre> In this example the user has requested a 30-minute session on any node in the main cluster (the default request is one processor on one node). A node is available to run the job and so the user finds herself at a shell prompt. She may now use the node interactively for 30 minutes. At the end of the 30 minutes (wall time, not CPU time!) the job will end and the session will be terminated. If the user logs out before then, the job will terminate early. Note that, just as with ordinary jobs, you must honour the allocation you are given. If you ask for one CPU on one node, you must only use that. In the interactive shell, a number of environment variables beginning with PBS (try <tt>printenv | grep PBS</tt>) are provided to tell you the job environment in which you are running in case you have forgotten. If your request for an interactive shell cannot be fulfilled (because there are insufficient nodes available in the queue that you have specified), the qsub command will wait until it can be. == Advanced topics == === Multiple CPUs === If you know you want a machine completely dedicated to you (e.g. because you plan to run multithreaded code interactively) then you must explicitly request that: e.g., <pre> qsub -l walltime=24:00:00 -l nodes=1:ppn=48 -I -q smp </pre> will reserve all 48 cores of one of the [[SMP machines]] for you for a day. === Multiple nodes === In the unlikely situation in which you want to run interactively on several nodes at once, you will only be given one shell. You can use the <tt>pbsdsh</tt> command to run the same commands on each of the processors you have been allocated, or /usr/local/bin/mpiexec to run [[MPI]] jobs. 
<pre> qsub -l walltime=00:01:00 -l nodes=2:ppn=2 -I -q smp qsub: waiting for job 123456.stri-cluster.herts.ac.uk to start qsub: job 123456.stri-cluster.herts.ac.uk ready [user@smp2 ~]$ pbsdsh hostname smp2 smp1 smp1 smp2 </pre> === Specific machines === It is possible to request a specific machine just as for normal non-interactive [[jobs]]: <pre> qsub -l walltime=00:01:00 -l nodes=smp2:ppn=48 -I -q smp </pre> Do this if you know that you need to use some particular feature of the machine, such as the [[SMP machines]]' scratch discs. === X forwarding === If you want to run jobs that use the X windows system, you may turn on X11 forwarding using the <tt>-X</tt> option. (This is like the <tt>-X</tt> option to <tt>ssh</tt>.) === Walltime requests === Please be considerate in your walltime requests. While you have an interactive job running, no other user can use the resources you have allocated. Therefore, you should not set an interactive job running for a week and walk away. Jobs that appear to be reserving resources without using them will be terminated. Obviously the best strategy is to select a walltime that roughly matches what you want and exit before the time is up. a96afda426acd8573deb7c6dab464a78029ad7b4 SMP machines 0 24 216 159 2011-04-23T07:49:01Z Mjh 2 wikitext text/x-wiki The SMP machines are two 4-processor, 48-core systems each with 256 Gb of RAM. The individual cores in these machines are slightly slower (~ 20% on CPU-bound floating-point operations) than the Intel CPUs of the main computing nodes. So users are recommended to use the main part of the cluster for straightforward computing requirements that do not need the special features of the SMP machines. The big advantage of the SMP machines is the large amount of physical memory visible to all 48 cores. This allows for multi-threaded, shared-memory applications. The SMP machines also each have 10 Tb of local scratch disc space (mounted as /scratch). Jobs may be started on the SMP machines using the <tt>smp</tt> [[queues|queue]] in Torque. e7b6d5eaf2f9b871b1975e7e8de2a18ed237d3a9 Architecture 0 7 217 198 2011-04-23T07:50:33Z Mjh 2 wikitext text/x-wiki The cluster consists of * a head node, which is an 2 socket x 4-core x 2 Hyperthreads Xeon-based machine with 32 Gb RAM * 80 compute nodes (or just 'nodes'), of which ** 48 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 24 Gb RAM and DDR Infiniband form the Main cluster (chassis 1, 2 and 3) ** 32 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 12 Gb RAM and QDR Infiniband form the CAIR cluster (chassis 4 and 5) * Two [[SMP machines]] each with 48 cores (4 sockets x 12 cores, Opteron 6174), 256 Gb RAM and QDR Infiniband (smp1 and smp2). * A [[Tesla|Tesla S2050]] CUDA unit attached to smp1. * 9 [[sandbox machines]] (lower-power machines with 1 Gb/core and no Infiniband, intended for testing and development) * 110 Tb of [[storage]] attached via Fibre Channel to the head node * Ethernet and infiniband switches to provide connectivity. The nodes are Dell blades installed in enclosures (chassis) of 16 blades each: so there are 5 chassis in total, 3 in the main cluster and 2 in the CAIR cluster. Nodes within a given chassis communicate via switches internal to the chassis; external switches connect the chassis together. See [[networking]] for more details. 
http://stri-cluster.herts.ac.uk/cluster.jpg f24b3509a63a9e0bcb02990e466a2e71c3278f73 219 217 2011-04-23T07:55:43Z Mjh 2 wikitext text/x-wiki The cluster consists of * a head node, which is an 2 socket x 4-core x 2 Hyperthreads Xeon-based machine with 32 Gb RAM * 80 compute nodes (or just 'nodes'), of which ** 48 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 24 Gb RAM and DDR Infiniband form the Main cluster (chassis 1, 2 and 3) ** 32 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 12 Gb RAM and QDR Infiniband form the CAIR cluster (chassis 4 and 5) * Two [[SMP machines]] each with 48 cores (4 sockets x 12 cores, Opteron 6174), 256 Gb RAM and QDR Infiniband (smp1 and smp2). * A [[Tesla|Tesla S2050]] CUDA unit attached to smp1. * 9 [[sandbox machines]] (lower-power machines with 1 Gb/core and no Infiniband, intended for testing and development) * 110 Tb of [[storage]] attached via Fibre Channel to the head node * A separate server providing an additional 12 Tb of storage for CAIR use. * Ethernet and infiniband switches to provide connectivity. The compute nodes are Dell blades installed in enclosures (chassis) of 16 blades each: so there are 5 chassis in total, 3 in the main cluster and 2 in the CAIR cluster. Nodes within a given chassis communicate via switches internal to the chassis; external switches connect the chassis together. See [[networking]] for more details. The image below shows the nodes and storage, but does not reflect the current physical configuration of the cluster. http://stri-cluster.herts.ac.uk/cluster.jpg 0cd085a8760619e9e9c94839acd70f990677153d 228 219 2011-05-26T17:18:42Z Mjh 2 wikitext text/x-wiki The cluster consists of * a head node, which is an 2 socket x 4-core x 2 Hyperthreads Xeon-based machine with 32 Gb RAM * 96 compute nodes (or just 'nodes'), of which ** 48 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 24 Gb RAM and DDR Infiniband form the Main cluster (chassis 1, 2 and 3) ** 32 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 12 Gb RAM and QDR Infiniband form part of the CAIR cluster (chassis 4 and 5) ** 16 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 Gb RAM and QDR Infiniband form the rest of the CAIR cluster and the CAR cluster (chassis 6) * Two [[SMP machines]] each with 48 cores (4 sockets x 12 cores, Opteron 6174), 256 Gb RAM and QDR Infiniband (smp1 and smp2). * A [[Tesla|Tesla S2050]] CUDA unit attached to smp1. * 9 [[sandbox machines]] (lower-power machines with 1 Gb/core and no Infiniband, intended for testing and development) * 110 Tb of [[storage]] attached via Fibre Channel to the head node * A separate server providing an additional 12 Tb of storage for CAIR use. * Ethernet and infiniband switches to provide connectivity. The compute nodes are Dell blades installed in enclosures (chassis) of 16 blades each: there are 6 chassis in total. Nodes within a given chassis communicate via switches internal to the chassis; external switches connect the chassis together. See [[networking]] for more details. The image below shows some of the nodes and storage, but does not reflect the current physical configuration of the cluster. 
http://stri-cluster.herts.ac.uk/cluster.jpg 5903edbf214aac09aa217da9f2f5786e1829ddf0 244 228 2011-09-06T12:08:29Z Mjh 2 wikitext text/x-wiki The cluster consists of * a head node, which is an 2 socket x 4-core x 2 Hyperthreads Xeon-based machine with 32 Gb RAM * 96 compute nodes (or just 'nodes'), of which ** 48 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 24 Gb RAM and DDR Infiniband form the Main cluster (chassis 1, 2 and 3) ** 32 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 12 Gb RAM and QDR Infiniband form part of the CAIR cluster (chassis 4 and 5) ** 12 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 Gb RAM and QDR Infiniband form the rest of the CAIR cluster (chassis 6) ** 16 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 Gb RAM and QDR Infiniband form the CAR cluster (chassis 7) * Two [[SMP machines]] each with 48 cores (4 sockets x 12 cores, Opteron 6174), 256 Gb RAM and QDR Infiniband (smp1 and smp2). * A [[Tesla|Tesla S2050]] CUDA unit attached to smp1. * 9 [[sandbox machines]] (lower-power machines with 1 Gb/core and no Infiniband, intended for testing and development) * 110 Tb of [[storage]] attached via Fibre Channel to the head node * A separate server providing an additional 12 Tb of storage for CAIR use. * Ethernet and infiniband switches to provide connectivity. The compute nodes are Dell blades installed in enclosures (chassis) of 16 blades each: there are 7 chassis in total. Nodes within a given chassis communicate via switches internal to the chassis; external switches connect the chassis together. See [[networking]] for more details. The image below shows some of the nodes and storage, but does not reflect the current physical configuration of the cluster. http://stri-cluster.herts.ac.uk/cluster.jpg 849a9336f5dddf6ec5bc692e68f45008ad73f8dc 248 244 2011-09-21T16:54:39Z Mjh 2 wikitext text/x-wiki The cluster consists of * a head node, which is an 2 socket x 4-core x 2 Hyperthreads Xeon-based machine with 32 Gb RAM * 96 compute nodes (or just 'nodes'), of which ** 48 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 24 Gb RAM and DDR Infiniband form the Main cluster (chassis 1, 2 and 3) ** 32 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 12 Gb RAM and QDR Infiniband form part of the CAIR cluster (chassis 4 and 5) ** 12 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 Gb RAM and QDR Infiniband form the rest of the CAIR cluster (chassis 6) ** 16 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 Gb RAM and QDR Infiniband form the CAR cluster (chassis 7) * Two [[SMP machines]] each with 48 cores (4 sockets x 12 cores, Opteron 6174), 256 Gb RAM and QDR Infiniband (smp1 and smp2). * A [[Tesla|Tesla S2050]] CUDA unit attached to smp1. * 9 [[sandbox machines]] (lower-power machines with 1 Gb/core and no Infiniband, intended for testing and development) * 110 Tb of [[storage]] attached via Fibre Channel to the head node * A separate server providing an additional 12 Tb of storage for CAIR use. * Ethernet and infiniband switches to provide connectivity. The compute nodes are Dell blades installed in enclosures (chassis) of 16 blades each: there are 7 chassis in total. Nodes within a given chassis communicate via switches internal to the chassis; external switches connect the chassis together. See [[networking]] for more details. The image below shows the current physical layout of the cluster components. 
http://stri-cluster.herts.ac.uk/cluster2.jpg 8f2fb0bf2dee6c4cc8e9d49a05ba53bf48c9059e Networking 0 10 218 143 2011-04-23T07:53:37Z Mjh 2 wikitext text/x-wiki The nodes are linked by Gigabit ethernet and Infiniband networks. The ethernet network is a fairly conventional one; each chassis (see [[architecture]]) has an internal ethernet switch to which all nodes in that chassis are connected. These switches, together with the head node, are connected together via a main ethernet switch. The links connecting the head node and the SMP machines, and the uplinks from the two node internal ethernet stacks, are carried by multiple physical cables via link aggregation. A seperate physical ethernet network carries management traffic. There are in fact two infiniband networks: one for the main cluster, which is mostly dual data-rate (nominal 2.5 Gbit/s) and one for the CAIR cluster, which is quad data-rate (5.0 Gbit/s). As for the ethernet, each chassis has an internal Infiniband switch. The two sub-clusters each have an external Infiniband switch which connects the chassis that make up the cluster and the head node is connected, separately, to both of those switches. Therefore there is no native infiniband connectivity between the two sub-clusters. IP over Infiniband can be used to connect between them, but this should not be relied on for large data volumes as it must be routed via the head node. The networking arrangements should be taken into consideration when running jobs where fast communication is important -- latency and data transfer rates are somewhat higher between nodes in the same chassis than between different chassis in the same cluster, and ethernet connections between the two sub-clusters are higher-latency and lower-bandwidth still, while native Infiniband connections between the two sub-clusters are essentially impossible. Best results will be obtained running jobs within a single chassis. None of the nodes are on the public Internet: they all have private IP addresses starting 192.168.... These addresses are not routable from anywhere else in the University. They can, however, all see the public Internet via IP masquerading through the head node through the ethernet network. (For this reason, it is not sensible to try to make high-volume accesses to the Internet from the nodes.) Each node has an IP address corresponding to its ethernet port (192.168.2.xxx where xxx is the node number) and one corresponding to the Infiniband port (192.168.3.xxx or 4.xxx, where the 3 and 4 refer to the main and CAIR clusters respectively). Within the cluster, you may use nodexxx.data (e.g. node001.data) and nodexxx.infi (node001.infi) to refer to these two networks. High-volume traffic should use the Infiniband network. The two SMP machines are attached to the Infiniband network of the main cluster and are called smp1 and smp2: thus they have addresses smp1.data, smp1.infi etc. 7e2292b696771b22f430dd85f5e06b8c6ff4ab55 245 218 2011-09-06T12:09:44Z Mjh 2 wikitext text/x-wiki The nodes are linked by Gigabit ethernet and Infiniband networks. The ethernet network is a fairly conventional one; each chassis (see [[architecture]]) has an internal ethernet switch to which all nodes in that chassis are connected. These switches, together with the head node, are connected together via a main ethernet switch. The links connecting the head node and the SMP machines, and the uplinks from the two node internal ethernet stacks, are carried by multiple physical cables via link aggregation. 
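To check that the ethernet and Infiniband networks are behaving as you expect from inside the cluster, you can look up and ping the <tt>.data</tt> and <tt>.infi</tt> names of a node. This is only a sketch and <tt>node001</tt> is just an example:

<pre>
# show the addresses behind the two names (192.168.2.x for ethernet, 192.168.3.x or 192.168.4.x for Infiniband)
getent hosts node001.data node001.infi

# confirm basic reachability over each network
ping -c 1 node001.data
ping -c 1 node001.infi
</pre>

If the <tt>.infi</tt> name does not resolve or respond, IP over Infiniband is probably not available from the machine you are testing on.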
A seperate physical ethernet network carries management traffic. There are in fact two infiniband networks: one for the main and CAR clusters, which is mostly dual data-rate (nominal 2.5 Gbit/s) and one for the CAIR cluster, which is quad data-rate (5.0 Gbit/s). As for the ethernet, each chassis has an internal Infiniband switch. The two sub-clusters each have an external Infiniband switch which connects the chassis that make up the cluster and the head node is connected, separately, to both of those switches. Therefore there is no native infiniband connectivity between the two sub-clusters. IP over Infiniband can be used to connect between them, but this should not be relied on for large data volumes as it must be routed via the head node. The networking arrangements should be taken into consideration when running jobs where fast communication is important -- latency and data transfer rates are somewhat higher between nodes in the same chassis than between different chassis in the same cluster, and ethernet connections between the two sub-clusters are higher-latency and lower-bandwidth still, while native Infiniband connections between the two sub-clusters are essentially impossible. Best results will be obtained running jobs within a single chassis. None of the nodes are on the public Internet: they all have private IP addresses starting 192.168.... These addresses are not routable from anywhere else in the University. They can, however, all see the public Internet via IP masquerading through the head node through the ethernet network. (For this reason, it is not sensible to try to make high-volume accesses to the Internet from the nodes.) Each node has an IP address corresponding to its ethernet port (192.168.2.xxx where xxx is the node number) and one corresponding to the Infiniband port (192.168.3.xxx or 4.xxx, where the 3 and 4 refer to the main/CAR and CAIR clusters respectively). Within the cluster, you may use nodexxx.data (e.g. node001.data) and nodexxx.infi (node001.infi) to refer to these two networks. High-volume traffic should use the Infiniband network. The two SMP machines are attached to the Infiniband network of the main/CAR cluster and are called smp1 and smp2: thus they have addresses smp1.data, smp1.infi etc. 5ca2e68e05a6649aa940155ccc18bc4135d2094b MPI 0 12 220 194 2011-04-23T08:24:22Z Mjh 2 wikitext text/x-wiki == What is MPI? == MPI stands for ''Message Passing Interface''. It provides a standard for communication between processes either running on the same host or over a network. Programs written using this standard can communicate with each other without needing to know details of the way in which they are connected. For a more detailed description see the [http://en.wikipedia.org/wiki/Message_Passing_Interface wikipedia page]. The MPI standard (with extensions in some cases) has been implemented by many different groups. MPI tutorials are widely available on the web. == MPI on the cluster == There are several implementations of MPI on the cluster. All of them provide compilation tools (mpicc, mpif77), libraries, and a means of running code. All use of MPI on the cluster must be integrated with the [[jobs|job control system]] and this page describes how to do that. You should read the page on [[jobs]] before trying to do anything with any of the example scripts given here! === MPICH2 === MPICH2 is the default implementation since it is provided as a package within Fedora. If you take no action, your MPI code will be compiled with MPICH2. 
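For example, a simple MPI program can be compiled with the wrapper compilers and the resulting binary started through the job system as described below. This is a sketch: <tt>hello.c</tt> and the output name are placeholders, and <tt>-show</tt> is an MPICH wrapper option that simply prints the underlying compiler command rather than compiling anything.

<pre>
# compile an MPI source file with the default (MPICH2) wrappers
mpicc -o ~/hello-mpi hello.c

# see which compiler, include paths and libraries the wrapper actually uses
mpicc -show
</pre>

The binary should then be started on the compute nodes via <tt>mpiexec</tt> from a job script, not run directly on the head node.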
Check which implementation you're using like this: <pre> > which mpicc /usr/lib64/mpich2/bin/mpicc </pre> MPICH2 is '''not''' Infiniband-aware. The communication between processes will use the Ethernet network, and so the latency and total available bandwidth of IPC will be about an order of magnitude worse than it would be with Infiniband-aware implementations (see below). To run MPICH2 jobs, your job control system script should call <tt>/usr/local/bin/mpiexec</tt>: <pre> #!/bin/sh -f #PBS -N mpich2-demo #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ /usr/local/bin/mpiexec /home/myusername/mympijob echo ------------------------------------------------------ echo Job ends </pre> <tt>mpiexec</tt> uses the Torque environment variables to start processes up on the appropriate nodes. Note that in this example the '''only''' key line is the one calling <tt>mpiexec</tt>. The others are Torque directives or are there for debugging/demonstration purposes. For documentation of <tt>mpiexec</tt>, see [http://www.osc.edu/~djohnson/mpiexec/index.php]. By default, <tt>mpiexec</tt> will run as many processes as there are processors available to it: so if you have nodes=16:ppn=8 set, as in the example above, there will be 16*8=128 MPI processes (8 per node). If for some reason you want to alter this default behaviour (e.g. if your MPI code needs to run multi-threaded, as would be the case if you were mixing MPI and OpenMP), then <tt>mpiexec</tt> has options to allow this: run <tt>/usr/local/bin/mpiexec</tt> on the head node with no arguments to get a list of possible options. === MPICH2 (local versions) === Locally compiled, more up-to-date versions of MPICH2 are available. To use these use <tt>modules</tt> commands: do <pre> module unload mpich2-x86_64 module load mpich2-local OR module load mpich2-intel </pre> Then <pre> which mpicc /soft/mpich2/bin/mpicc </pre> If you wish to use these permanently, then you are recommended to put these module commands in your .cshrc or .bashrc. Jobs compiled this way should also be run with /usr/local/bin/mpiexec. === MVAPICH2 === MVAPICH2 is one of the three MPI implementations installed as part of the [http://www.openfabrics.org/ OFED] Infiniband software distribution. It uses native Infiniband system calls to communicate. This means that the latency of IPC can be as low as a few microseconds and the full bandwidth of the Infiniband interfaces is available. 
To use MVAPICH2 do <pre> module unload mpich2-x86_64 module load mvapich2 </pre> Then you should see <pre> > which mpicc /usr/mpi/gcc/mvapich2-1.4.1/bin/mpicc </pre> <tt>mpiexec</tt> also works for starting MVAPICH2 jobs: <pre> #!/bin/sh -f #PBS -N mvapich2-demo #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ /usr/local/bin/mpiexec /home/myusername/mympijob echo ------------------------------------------------------ echo Job ends </pre> === MVAPICH === This is an earlier Infiniband-aware implementation of MPI. Not recommended. === OpenMPI === This is the third implementation provided by the OFED packages; like the others, it can be selected via the [[modules]] system. In principle it should be as good as MVIPICH2 from the point of view of Infiniband use. If you want to use it you will need to integrate it with Torque yourself -- please feel free to do so and to document it here. 22fd4dbffdd83b7f2f39407e40bbee2593b901d9 Jobs 0 9 221 210 2011-04-28T21:23:20Z Mjh 2 wikitext text/x-wiki The main mechanism for carrying out computing on the cluster is the '''batch queue system'''. This is based around [http://www.clusterresources.com/torquedocs21/ Torque] (formerly known as PBS) and [http://www.clusterresources.com/products/maui/docs/mauiadmin.shtml Maui], which are free software and widely used in the HPC world, particularly in academia. Rather than running processes directly on the compute nodes, you submit a script which, in general, contains both directives to the queuing system and the commands needed to run your jobs. The scheduler will then decide, based on the available resources, whether your job can be run immediately or must be deferred till later. If other people's jobs are running, you are likely to have to wait. It is important therefore to request only the minimum resources you need: a job that only needs 8 nodes may be able to run immediately while a job that requests all 48 nodes in the main cluster may need to wait for days. Once your job starts running the standard output and standard error can be saved or discarded, at your option. Most jobs will operate on files in one of the cluster-wide file systems (see [[storage]]). You can also arrange to have the system send you e-mail when your job starts and/or ends. You should enable [[passwordless ssh]] between nodes before starting to use the job submission system. Various parts of it and code that depends on it rely on being able to use ssh or scp. == Basic commands == The command to submit a job to the batch queue system is <tt>qsub</tt>. <tt>qsub</tt> takes one obligatory argument, the name of the script that will be run to start the job: <pre> qsub myjob.sh </pre> This must be a script: specifying a binary file will not work. Optionally, qsub can take a number of arguments that specify what resources the job requires and how it is to be handled by the queuing system. 
Alternatively, these can be specified in the script, tagged with lines beginning <tt>#PBS</tt> before any executable lines. For example, <pre> cat << END > myjob1.sh #!/bin/sh echo "Hello world" END qsub -N hello -m abe myjob1.sh </pre> and <pre> cat <<END > myjob2.sh #!/bin/sh #PBS -N hello #PBS -m abe echo "Hello world" END qsub myjob2.sh </pre> are functionally equivalent. It is often more useful to have the options specified in the script. You can also have a mixture: if the same option is specified on the <tt>qsub</tt> command line and in a script directive, the command line takes precedence. A full list of qsub options is given by <tt>man qsub</tt> or is available [http://www.clusterresources.com/torquedocs21/commands/qsub.shtml online]. Some important ones are as follows: * -N: specify the name of the job as it will appear in the queue * -m: say when the system should e-mail you: <tt>-m abe</tt> means to e-mail on abort, begin and end. Before setting this flag, read the page on [[Mail]] and take appropriate action! * -k: specify whether the standard output or standard error streams of the job should be preserved (by default in your home directory). * -j: specify whether the output and error streams should be kept separate or merged. * -q: specify what [[queues|queue]] the job will be run on. * -t: start multiple jobs simultaneously (see below) The most important option is <tt>-l</tt>, which specifies the resources to be used by the job. On our system the useful resources to specify are * <tt>walltime</tt>: the expected execution time in h:m:s format. * <tt>nodes</tt>: a list of requirements on the nodes to be allocated, or possibly more than one joined by a plus sign. * <tt>pmem</tt>: the physical [[memory]] requirements for your job. For example, <pre> qsub -l walltime=2:0:0 -l nodes=40 job.sh ## run for two hours on 40 CPUs (which may be spread over up to 40 physical nodes) qsub -l walltime=1:0:0 -l nodes=16:ppn=8 job.sh ## run for one hour on 16 nodes with 8 CPUs per node qsub -l walltime=0:1:0 -l nodes=2:chassis1:ppn=4 job2.sh ## run for one minute on 2 nodes from chassis1 with 4 CPUs per node qsub -l walltime=0:5:0 -l nodes=1:chassis1:ppn=1+1:chassis2:ppn=1 job3.sh ## run for 5 minutes on one node from chassis1 and one from chassis2 qsub -l walltime=0:0:10 -l nodes=node001:ppn=8 job4.sh ## run for 10 seconds on all CPUs of node001 qsub -l walltime=0:1:0 -l nodes=1:2 -l pmem=8gb job5.sh ## Run two processes for one minute, requesting 8 Gb of memory each </pre> (In general you will not want to specify node or [[architecture|chassis]] names directly, as this will interfere with the efficiency of the scheduling. The exception might be if you want to guarantee that all inter-process communications take place within a chassis; see [[networking]]). Note also that in the present configuration requests for fewer processors per node than the 8 that are available may be consolidated onto fewer nodes. So the third request above, run on an empty cluster, would actually run on one physical node with 8 CPUs. This should never be a problem unless you are specifically testing inter-node communication. For efficiency, you are recommended always to ask for and use 8 processes per node for MPI or multi-threaded code. If you are running single-threaded code, you should ask for one processor per node. The defaults for all the other options are sensible but you ''must'' think carefully about the resources you need. 
Requesting more time or resources than you need will lead to inefficiency in scheduling your jobs: requesting less than you need (particularly in terms of walltime) is likely to mean that your job does not run to completion. A job that exceeds the walltime estimate will be terminated. ''A walltime should be specified: the default walltime (see [[Queues]]) is probably not what you want.'' Once a job is submitted, you can view its progress with <tt>qstat</tt>: <pre> >qstat Job id Name User Time Use S Queue ------------------- ---------------- --------------- -------- - ----- 1765.stri-cluster HCG62-doublebeta mjh 00:00:00 R all 1766.stri-cluster ...59-doublebeta mjh 00:00:00 R all 1767.stri-cluster ...07-doublebeta mjh 00:00:00 R all 1768.stri-cluster ...83-doublebeta mjh 00:00:00 R all 1769.stri-cluster ...73-doublebeta mjh 00:00:00 R all 1770.stri-cluster ...61-doublebeta mjh 0 Q all 1771.stri-cluster ...36-doublebeta mjh 0 Q all 1772.stri-cluster ...44-doublebeta mjh 0 Q all 1773.stri-cluster ...29-doublebeta mjh 0 Q all 1774.stri-cluster ...33-doublebeta mjh 0 Q all 1775.stri-cluster ...46-doublebeta mjh 0 Q all >qstat -a stri-cluster.herts.ac.uk: Req'd Req'd Elap Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time -------------------- -------- -------- ---------- ------ ----- --- ------ ----- - ----- 1765.stri-cluster.he mjh all HCG62-doub 11125 16 -- -- 05:00 R 01:56 1766.stri-cluster.he mjh all IC1459-dou 10658 16 -- -- 05:00 R 01:56 1767.stri-cluster.he mjh all NGC3607-do 7355 16 -- -- 05:00 R 01:56 1768.stri-cluster.he mjh all NGC383-dou 12069 16 -- -- 05:00 R 01:56 1769.stri-cluster.he mjh all NGC4073-do 12793 16 -- -- 05:00 R 01:56 1770.stri-cluster.he mjh all NGC4261-do -- 16 -- -- 05:00 Q -- 1771.stri-cluster.he mjh all NGC4636-do -- 16 -- -- 05:00 Q -- 1772.stri-cluster.he mjh all NGC5044-do -- 16 -- -- 05:00 Q -- 1773.stri-cluster.he mjh all NGC5129-do -- 16 -- -- 05:00 Q -- 1774.stri-cluster.he mjh all NGC533-dou -- 16 -- -- 05:00 Q -- 1775.stri-cluster.he mjh all NGC5846-do -- 16 -- -- 05:00 Q -- </pre> The different options to qstat give different views of the queue. Here we see that some jobs are running (R) and some are queued (Q). (Note that in the default <tt>qstat</tt> view the CPU time is not displayed correctly.) The MAUI tool <tt>showq</tt> can also be used to get a view of the whole queue: <pre> /usr/local/maui/bin/showq ACTIVE JOBS-------------------- JOBNAME USERNAME STATE PROC REMAINING STARTTIME 1765 mjh Running 128 1:15:20 Fri May 7 13:38:20 1766 mjh Running 128 1:15:20 Fri May 7 13:38:20 1767 mjh Running 128 1:15:20 Fri May 7 13:38:20 1768 mjh Running 128 1:15:20 Fri May 7 13:38:20 1769 mjh Running 128 1:15:20 Fri May 7 13:38:20 5 Active Jobs 640 of 640 Processors Active (100.00%) 80 of 80 Nodes Active (100.00%) IDLE JOBS---------------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME 1770 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1771 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1772 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1773 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1774 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1775 mjh Idle 128 5:00:00 Fri May 7 13:38:20 6 Idle Jobs BLOCKED JOBS---------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME Total Jobs: 11 Active Jobs: 5 Idle Jobs: 6 Blocked Jobs: 0 </pre> Finally, you can delete jobs from the queue with the Torque command <tt>qdel</tt>, you can change your resource request with <tt>qualter</tt>, and you can move queued jobs between different queues with <tt>qmove</tt>. 
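A few examples of these management commands are given below. The job IDs and queue name are placeholders, and note that the resource-altering command is spelled <tt>qalter</tt>:

<pre>
# remove job 1770 from the queue (or kill it if it is already running)
qdel 1770

# reduce the walltime request of a queued job
qalter -l walltime=1:00:00 1771

# move a queued job to a different queue (see [[queues]])
qmove shortqueue 1772
</pre>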
== Running code == Your job will start running on ''one'' of the nodes it has been allocated. It is your script's responsibility to start your code running on any other nodes you have been allocated. At run time, environment variables contain information about the job. A full list is [http://www.clusterresources.com/torquedocs21/2.1jobsubmission.shtml#submitvars here]. The crucial one is $PBS_NODEFILE, which provides a list of nodes on which the job should run. ''You must honour this node list.'' The [http://www.clusterresources.com/torquedocs21/commands/pbsdsh.shtml pbsdsh] command can be used in your script to run a single program on all the allocated nodes, as in the following example: <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ pbsdsh hostname echo ------------------------------------------------------ echo Job ends </pre> pbsdsh would be suitable for starting a number of parallel processes which do not need to communicate, e.g. multiple identical Monte Carlo simulations. For scripts to start [[MPI]] processes in such a way that inter-process communication (IPC) works see the description of [[MPI]]. == Environment == In all cases your script must set the environment that you want your program to run in. Environment variables, aliases and working directory are not inherited from the shell that runs qsub. So you will often want to change to an appropriate working directory and to source some startup files in your qsub script before you actually execute any code. <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ cd /home/fred/my_working_directory export PATH=/home/fred/my_binaries:${PATH} /usr/local/bin/mpiexec my-mpi-code arg1 arg2 echo ------------------------------------------------------ echo Job ends </pre> == Multiple job submission == The <tt>-t</tt> option to qsub allows you to create multiple jobs simultaneously. These will all run the same script but will differ in the value of the environment variable <tt>$PBS_ARRAYID</tt>. Your script should normally use this variable to decide how to behave differently. For example, multiple runs of a Monte Carlo simulation might use it to store the output in different files. 
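A minimal sketch of such a script is shown below; the program name, directory and resource requests are placeholders:

<pre>
#!/bin/sh
#PBS -N mc-array
#PBS -l nodes=1:ppn=1
#PBS -l walltime=1:00:00
cd /home/myusername/montecarlo
# each job in the array gets a different value of PBS_ARRAYID,
# so each run writes its results to its own output file
./my-mc-code > run-${PBS_ARRAYID}.out
</pre>

Submitted with <tt>qsub -t 1-10 mc-array.sh</tt>, this queues ten independent single-processor jobs.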
The jobs will all be submitted to the queue immediately and may run concurrently, depending on the resources they request, so it is important to make sure that it is safe to do this. A common mistake is to use the -t option wrongly. <pre> qsub -t 4 myjob.qsub </pre> will only start ''one'' job, with $PBS_ARRAYID set to 4. <pre> qsub -t 1-4 myjob.qsub </pre> will queue four jobs, with $PBS_ARRAYID ranging from 1 to 4. == Interactive jobs == If you need to access nodes interactively, see the separate page on [[interactive jobs]]. 808512aa9bac838e38a1b349548f68de440d3073 236 221 2011-07-17T11:50:09Z Mjh 2 /* Basic commands */ wikitext text/x-wiki The main mechanism for carrying out computing on the cluster is the '''batch queue system'''. This is based around [http://www.clusterresources.com/torquedocs21/ Torque] (formerly known as PBS) and [http://www.clusterresources.com/products/maui/docs/mauiadmin.shtml Maui], which are free software and widely used in the HPC world, particularly in academia. Rather than running processes directly on the compute nodes, you submit a script which, in general, contains both directives to the queuing system and the commands needed to run your jobs. The scheduler will then decide, based on the available resources, whether your job can be run immediately or must be deferred till later. If other people's jobs are running, you are likely to have to wait. It is important therefore to request only the minimum resources you need: a job that only needs 8 nodes may be able to run immediately while a job that requests all 48 nodes in the main cluster may need to wait for days. Once your job starts running the standard output and standard error can be saved or discarded, at your option. Most jobs will operate on files in one of the cluster-wide file systems (see [[storage]]). You can also arrange to have the system send you e-mail when your job starts and/or ends. You should enable [[passwordless ssh]] between nodes before starting to use the job submission system. Various parts of it and code that depends on it rely on being able to use ssh or scp. == Basic commands == The command to submit a job to the batch queue system is <tt>qsub</tt>. <tt>qsub</tt> takes one obligatory argument, the name of the script that will be run to start the job: <pre> qsub myjob.sh </pre> This must be a script: specifying a binary file will not work. Optionally, qsub can take a number of arguments that specify what resources the job requires and how it is to be handled by the queuing system. Alternatively, these can be specified in the script, tagged with lines beginning <tt>#PBS</tt> before any executable lines. For example, <pre> cat << END > myjob1.sh #!/bin/sh echo "Hello world" END qsub -N hello -m abe myjob1.sh </pre> and <pre> cat <<END > myjob2.sh #!/bin/sh #PBS -N hello #PBS -m abe echo "Hello world" END qsub myjob2.sh </pre> are functionally equivalent. It is often more useful to have the options specified in the script. You can also have a mixture: if the same option is specified on the <tt>qsub</tt> command line and in a script directive, the command line takes precedence. A full list of qsub options is given by <tt>man qsub</tt> or is available [http://www.clusterresources.com/torquedocs21/commands/qsub.shtml online]. Some important ones are as follows: * -N: specify the name of the job as it will appear in the queue * -m: say when the system should e-mail you: <tt>-m abe</tt> means to e-mail on abort, begin and end. 
Before setting this flag, read the page on [[Mail]] and take appropriate action! * -k: specify whether the standard output or standard error streams of the job should be preserved (by default in your home directory). * -j: specify whether the output and error streams should be kept separate or merged. * -q: specify what [[queues|queue]] the job will be run on. * -t: start multiple jobs simultaneously (see below) The most important option is <tt>-l</tt>, which specifies the resources to be used by the job. On our system the useful resources to specify are * <tt>walltime</tt>: the expected execution time in h:m:s format. * <tt>nodes</tt>: a list of requirements on the nodes to be allocated, or possibly more than one joined by a plus sign. * <tt>pmem</tt>: the physical [[memory]] requirements for your job. For example, <pre> qsub -l walltime=2:0:0 -l nodes=40 job.sh ## run for two hours on 40 CPUs (which may be spread over up to 40 physical nodes) qsub -l walltime=1:0:0 -l nodes=16:ppn=8 job.sh ## run for one hour on 16 nodes with 8 CPUs per node qsub -l walltime=0:1:0 -l nodes=2:chassis1:ppn=4 job2.sh ## run for one minute on 2 nodes from chassis1 with 4 CPUs per node qsub -l walltime=0:5:0 -l nodes=1:chassis1:ppn=1+1:chassis2:ppn=1 job3.sh ## run for 5 minutes on one node from chassis1 and one from chassis2 qsub -l walltime=0:0:10 -l nodes=node001:ppn=8 job4.sh ## run for 10 seconds on all CPUs of node001 qsub -l walltime=0:1:0 -l nodes=1:2 -l pmem=8gb job5.sh ## Run two processes for one minute, requesting 8 Gb of memory each </pre> (In general you will not want to specify node or [[architecture|chassis]] names directly, as this will interfere with the efficiency of the scheduling. The exception might be if you want to guarantee that all inter-process communications take place within a chassis; see [[networking]]). Note also that in the present configuration requests for fewer processors per node than the 8 that are available may be consolidated onto fewer nodes. So the third request above, run on an empty cluster, would actually run on one physical node with 8 CPUs. This should never be a problem unless you are specifically testing inter-node communication. For efficiency, you are recommended always to ask for and use 8 processes per node for MPI or multi-threaded code. If you are running single-threaded code, you should ask for one processor per node. The defaults for all the other options are sensible but you ''must'' think carefully about the resources you need. Requesting more time or resources than you need will lead to inefficiency in scheduling your jobs: requesting less than you need (particularly in terms of walltime) is likely to mean that your job does not run to completion. A job that exceeds the walltime estimate will be terminated. 
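As a concrete illustration (the numbers are only an example), a single-process job expected to take about ten hours and to use up to 16 Gb of memory might be submitted as

<pre>
qsub -l walltime=12:0:0 -l nodes=1:ppn=1 -l pmem=16gb bigjob.sh
</pre>

with a modest margin added to the walltime so that the job is not killed just before it finishes, but not so much that the scheduler treats it as a far larger request than it really is.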
''A walltime should be specified: the default walltime (see [[Queues]]) is probably not what you want.'' Once a job is submitted, you can view its progress with <tt>qstat</tt>: <pre> >qstat Job id Name User Time Use S Queue ------------------- ---------------- --------------- -------- - ----- 1765.stri-cluster HCG62-doublebeta mjh 00:00:00 R all 1766.stri-cluster ...59-doublebeta mjh 00:00:00 R all 1767.stri-cluster ...07-doublebeta mjh 00:00:00 R all 1768.stri-cluster ...83-doublebeta mjh 00:00:00 R all 1769.stri-cluster ...73-doublebeta mjh 00:00:00 R all 1770.stri-cluster ...61-doublebeta mjh 0 Q all 1771.stri-cluster ...36-doublebeta mjh 0 Q all 1772.stri-cluster ...44-doublebeta mjh 0 Q all 1773.stri-cluster ...29-doublebeta mjh 0 Q all 1774.stri-cluster ...33-doublebeta mjh 0 Q all 1775.stri-cluster ...46-doublebeta mjh 0 Q all >qstat -a stri-cluster.herts.ac.uk: Req'd Req'd Elap Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time -------------------- -------- -------- ---------- ------ ----- --- ------ ----- - ----- 1765.stri-cluster.he mjh all HCG62-doub 11125 16 -- -- 05:00 R 01:56 1766.stri-cluster.he mjh all IC1459-dou 10658 16 -- -- 05:00 R 01:56 1767.stri-cluster.he mjh all NGC3607-do 7355 16 -- -- 05:00 R 01:56 1768.stri-cluster.he mjh all NGC383-dou 12069 16 -- -- 05:00 R 01:56 1769.stri-cluster.he mjh all NGC4073-do 12793 16 -- -- 05:00 R 01:56 1770.stri-cluster.he mjh all NGC4261-do -- 16 -- -- 05:00 Q -- 1771.stri-cluster.he mjh all NGC4636-do -- 16 -- -- 05:00 Q -- 1772.stri-cluster.he mjh all NGC5044-do -- 16 -- -- 05:00 Q -- 1773.stri-cluster.he mjh all NGC5129-do -- 16 -- -- 05:00 Q -- 1774.stri-cluster.he mjh all NGC533-dou -- 16 -- -- 05:00 Q -- 1775.stri-cluster.he mjh all NGC5846-do -- 16 -- -- 05:00 Q -- </pre> The different options to qstat give different views of the queue. Here we see that some jobs are running (R) and some are queued (Q). The MAUI tool <tt>showq</tt> can also be used to get a view of the whole queue: <pre> /usr/local/maui/bin/showq ACTIVE JOBS-------------------- JOBNAME USERNAME STATE PROC REMAINING STARTTIME 1765 mjh Running 128 1:15:20 Fri May 7 13:38:20 1766 mjh Running 128 1:15:20 Fri May 7 13:38:20 1767 mjh Running 128 1:15:20 Fri May 7 13:38:20 1768 mjh Running 128 1:15:20 Fri May 7 13:38:20 1769 mjh Running 128 1:15:20 Fri May 7 13:38:20 5 Active Jobs 640 of 640 Processors Active (100.00%) 80 of 80 Nodes Active (100.00%) IDLE JOBS---------------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME 1770 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1771 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1772 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1773 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1774 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1775 mjh Idle 128 5:00:00 Fri May 7 13:38:20 6 Idle Jobs BLOCKED JOBS---------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME Total Jobs: 11 Active Jobs: 5 Idle Jobs: 6 Blocked Jobs: 0 </pre> Finally, you can delete jobs from the queue with the Torque command <tt>qdel</tt>, you can change your resource request with <tt>qualter</tt>, and you can move queued jobs between different queues with <tt>qmove</tt>. == Running code == Your job will start running on ''one'' of the nodes it has been allocated. It is your script's responsibility to start your code running on any other nodes you have been allocated. At run time, environment variables contain information about the job. A full list is [http://www.clusterresources.com/torquedocs21/2.1jobsubmission.shtml#submitvars here]. 
The crucial one is $PBS_NODEFILE, which provides a list of nodes on which the job should run. ''You must honour this node list.'' The [http://www.clusterresources.com/torquedocs21/commands/pbsdsh.shtml pbsdsh] command can be used in your script to run a single program on all the allocated nodes, as in the following example: <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ pbsdsh hostname echo ------------------------------------------------------ echo Job ends </pre> pbsdsh would be suitable for starting a number of parallel processes which do not need to communicate, e.g. multiple identical Monte Carlo simulations. For scripts to start [[MPI]] processes in such a way that inter-process communication (IPC) works see the description of [[MPI]]. == Environment == In all cases your script must set the environment that you want your program to run in. Environment variables, aliases and working directory are not inherited from the shell that runs qsub. So you will often want to change to an appropriate working directory and to source some startup files in your qsub script before you actually execute any code. <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ cd /home/fred/my_working_directory export PATH=/home/fred/my_binaries:${PATH} /usr/local/bin/mpiexec my-mpi-code arg1 arg2 echo ------------------------------------------------------ echo Job ends </pre> == Multiple job submission == The <tt>-t</tt> option to qsub allows you to create multiple jobs simultaneously. These will all run the same script but will differ in the value of the environment variable <tt>$PBS_ARRAYID</tt>. Your script should normally use this variable to decide how to behave differently. For example, multiple runs of a Monte Carlo simulation might use it to store the output in different files. The jobs will all be submitted to the queue immediately and may run concurrently, depending on the resources they request, so it is important to make sure that it is safe to do this. A common mistake is to use the -t option wrongly. <pre> qsub -t 4 myjob.qsub </pre> will only start ''one'' job, with $PBS_ARRAYID set to 4. 
<pre> qsub -t 1-4 myjob.qsub </pre> will queue four jobs, with $PBS_ARRAYID ranging from 1 to 4. == Interactive jobs == If you need to access nodes interactively, see the separate page on [[interactive jobs]]. 929b87fb6a1d3da1e22410554df71c0d6e9d8c4c 239 236 2011-08-17T13:26:01Z Mjh 2 wikitext text/x-wiki The main mechanism for carrying out computing on the cluster is the '''batch queue system'''. This is based around [http://www.clusterresources.com/torquedocs21/ Torque] (formerly known as PBS) and [http://www.clusterresources.com/products/maui/docs/mauiadmin.shtml Maui], which are free software and widely used in the HPC world, particularly in academia. Rather than running processes directly on the compute nodes, you submit a script which, in general, contains both directives to the queuing system and the commands needed to run your jobs. The scheduler will then decide, based on the available resources, whether your job can be run immediately or must be deferred till later. If other people's jobs are running, you are likely to have to wait. It is important therefore to request only the minimum resources you need: a job that only needs 8 nodes may be able to run immediately while a job that requests all 48 nodes in the main cluster may need to wait for days. Once your job starts running the standard output and standard error can be saved or discarded, at your option. Most jobs will operate on files in one of the cluster-wide file systems (see [[storage]]). You can also arrange to have the system send you e-mail when your job starts and/or ends. You should enable [[passwordless ssh]] between nodes before starting to use the job submission system. Various parts of it and code that depends on it rely on being able to use ssh or scp. == Basic commands == The command to submit a job to the batch queue system is <tt>qsub</tt>. <tt>qsub</tt> takes one obligatory argument, the name of the script that will be run to start the job: <pre> qsub myjob.sh </pre> This must be a script: specifying a binary file will not work. Optionally, qsub can take a number of arguments that specify what resources the job requires and how it is to be handled by the queuing system. Alternatively, these can be specified in the script, tagged with lines beginning <tt>#PBS</tt> before any executable lines. For example, <pre> cat << END > myjob1.sh #!/bin/sh echo "Hello world" END qsub -N hello -m abe myjob1.sh </pre> and <pre> cat <<END > myjob2.sh #!/bin/sh #PBS -N hello #PBS -m abe echo "Hello world" END qsub myjob2.sh </pre> are functionally equivalent. It is often more useful to have the options specified in the script. You can also have a mixture: if the same option is specified on the <tt>qsub</tt> command line and in a script directive, the command line takes precedence. A full list of qsub options is given by <tt>man qsub</tt> or is available [http://www.clusterresources.com/torquedocs21/commands/qsub.shtml online]. Some important ones are as follows: * -N: specify the name of the job as it will appear in the queue * -m: say when the system should e-mail you: <tt>-m abe</tt> means to e-mail on abort, begin and end. Before setting this flag, read the page on [[Mail]] and take appropriate action! * -k: specify whether the standard output or standard error streams of the job should be preserved (by default in your home directory). * -j: specify whether the output and error streams should be kept separate or merged. 
* -o: specify where standard output/error should be stored, if not /home/user * -q: specify what [[queues|queue]] the job will be run on. * -t: start multiple jobs simultaneously (see below) * -W: for job inter-dependencies (see below) The most important option is <tt>-l</tt>, which specifies the resources to be used by the job. On our system the useful resources to specify are * <tt>walltime</tt>: the expected execution time in h:m:s format. * <tt>nodes</tt>: a list of requirements on the nodes to be allocated, or possibly more than one joined by a plus sign. * <tt>pmem</tt>: the physical [[memory]] requirements for your job. For example, <pre> qsub -l walltime=2:0:0 -l nodes=40 job.sh ## run for two hours on 40 CPUs (which may be spread over up to 40 physical nodes) qsub -l walltime=1:0:0 -l nodes=16:ppn=8 job.sh ## run for one hour on 16 nodes with 8 CPUs per node qsub -l walltime=0:1:0 -l nodes=2:chassis1:ppn=4 job2.sh ## run for one minute on 2 nodes from chassis1 with 4 CPUs per node qsub -l walltime=0:5:0 -l nodes=1:chassis1:ppn=1+1:chassis2:ppn=1 job3.sh ## run for 5 minutes on one node from chassis1 and one from chassis2 qsub -l walltime=0:0:10 -l nodes=node001:ppn=8 job4.sh ## run for 10 seconds on all CPUs of node001 qsub -l walltime=0:1:0 -l nodes=1:2 -l pmem=8gb job5.sh ## Run two processes for one minute, requesting 8 Gb of memory each </pre> (In general you will not want to specify node or [[architecture|chassis]] names directly, as this will interfere with the efficiency of the scheduling. The exception might be if you want to guarantee that all inter-process communications take place within a chassis; see [[networking]]). Note also that in the present configuration requests for fewer processors per node than the 8 that are available may be consolidated onto fewer nodes. So the third request above, run on an empty cluster, would actually run on one physical node with 8 CPUs. This should never be a problem unless you are specifically testing inter-node communication. For efficiency, you are recommended always to ask for and use 8 processes per node for MPI or multi-threaded code. If you are running single-threaded code, you should ask for one processor per node. The defaults for all the other options are sensible but you ''must'' think carefully about the resources you need. Requesting more time or resources than you need will lead to inefficiency in scheduling your jobs: requesting less than you need (particularly in terms of walltime) is likely to mean that your job does not run to completion. A job that exceeds the walltime estimate will be terminated. 
''A walltime should be specified: the default walltime (see [[Queues]]) is probably not what you want.'' Once a job is submitted, you can view its progress with <tt>qstat</tt>: <pre> >qstat Job id Name User Time Use S Queue ------------------- ---------------- --------------- -------- - ----- 1765.stri-cluster HCG62-doublebeta mjh 00:00:00 R all 1766.stri-cluster ...59-doublebeta mjh 00:00:00 R all 1767.stri-cluster ...07-doublebeta mjh 00:00:00 R all 1768.stri-cluster ...83-doublebeta mjh 00:00:00 R all 1769.stri-cluster ...73-doublebeta mjh 00:00:00 R all 1770.stri-cluster ...61-doublebeta mjh 0 Q all 1771.stri-cluster ...36-doublebeta mjh 0 Q all 1772.stri-cluster ...44-doublebeta mjh 0 Q all 1773.stri-cluster ...29-doublebeta mjh 0 Q all 1774.stri-cluster ...33-doublebeta mjh 0 Q all 1775.stri-cluster ...46-doublebeta mjh 0 Q all >qstat -a stri-cluster.herts.ac.uk: Req'd Req'd Elap Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time -------------------- -------- -------- ---------- ------ ----- --- ------ ----- - ----- 1765.stri-cluster.he mjh all HCG62-doub 11125 16 -- -- 05:00 R 01:56 1766.stri-cluster.he mjh all IC1459-dou 10658 16 -- -- 05:00 R 01:56 1767.stri-cluster.he mjh all NGC3607-do 7355 16 -- -- 05:00 R 01:56 1768.stri-cluster.he mjh all NGC383-dou 12069 16 -- -- 05:00 R 01:56 1769.stri-cluster.he mjh all NGC4073-do 12793 16 -- -- 05:00 R 01:56 1770.stri-cluster.he mjh all NGC4261-do -- 16 -- -- 05:00 Q -- 1771.stri-cluster.he mjh all NGC4636-do -- 16 -- -- 05:00 Q -- 1772.stri-cluster.he mjh all NGC5044-do -- 16 -- -- 05:00 Q -- 1773.stri-cluster.he mjh all NGC5129-do -- 16 -- -- 05:00 Q -- 1774.stri-cluster.he mjh all NGC533-dou -- 16 -- -- 05:00 Q -- 1775.stri-cluster.he mjh all NGC5846-do -- 16 -- -- 05:00 Q -- </pre> The different options to qstat give different views of the queue. Here we see that some jobs are running (R) and some are queued (Q). The MAUI tool <tt>showq</tt> can also be used to get a view of the whole queue: <pre> /usr/local/maui/bin/showq ACTIVE JOBS-------------------- JOBNAME USERNAME STATE PROC REMAINING STARTTIME 1765 mjh Running 128 1:15:20 Fri May 7 13:38:20 1766 mjh Running 128 1:15:20 Fri May 7 13:38:20 1767 mjh Running 128 1:15:20 Fri May 7 13:38:20 1768 mjh Running 128 1:15:20 Fri May 7 13:38:20 1769 mjh Running 128 1:15:20 Fri May 7 13:38:20 5 Active Jobs 640 of 640 Processors Active (100.00%) 80 of 80 Nodes Active (100.00%) IDLE JOBS---------------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME 1770 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1771 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1772 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1773 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1774 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1775 mjh Idle 128 5:00:00 Fri May 7 13:38:20 6 Idle Jobs BLOCKED JOBS---------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME Total Jobs: 11 Active Jobs: 5 Idle Jobs: 6 Blocked Jobs: 0 </pre> Finally, you can delete jobs from the queue with the Torque command <tt>qdel</tt>, you can change your resource request with <tt>qualter</tt>, and you can move queued jobs between different queues with <tt>qmove</tt>. == Running code == Your job will start running on ''one'' of the nodes it has been allocated. It is your script's responsibility to start your code running on any other nodes you have been allocated. At run time, environment variables contain information about the job. A full list is [http://www.clusterresources.com/torquedocs21/2.1jobsubmission.shtml#submitvars here]. 
The crucial one is $PBS_NODEFILE, which provides a list of nodes on which the job should run. ''You must honour this node list.'' The [http://www.clusterresources.com/torquedocs21/commands/pbsdsh.shtml pbsdsh] command can be used in your script to run a single program on all the allocated nodes, as in the following example: <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ pbsdsh hostname echo ------------------------------------------------------ echo Job ends </pre> pbsdsh would be suitable for starting a number of parallel processes which do not need to communicate, e.g. multiple identical Monte Carlo simulations. For scripts to start [[MPI]] processes in such a way that inter-process communication (IPC) works see the description of [[MPI]]. == Environment == In all cases your script must set the environment that you want your program to run in. Environment variables, aliases and working directory are not inherited from the shell that runs qsub. So you will often want to change to an appropriate working directory and to source some startup files in your qsub script before you actually execute any code. <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ cd /home/fred/my_working_directory export PATH=/home/fred/my_binaries:${PATH} /usr/local/bin/mpiexec my-mpi-code arg1 arg2 echo ------------------------------------------------------ echo Job ends </pre> == Multiple job submission == The <tt>-t</tt> option to qsub allows you to create multiple jobs simultaneously. These will all run the same script but will differ in the value of the environment variable <tt>$PBS_ARRAYID</tt>. Your script should normally use this variable to decide how to behave differently. For example, multiple runs of a Monte Carlo simulation might use it to store the output in different files. The jobs will all be submitted to the queue immediately and may run concurrently, depending on the resources they request, so it is important to make sure that it is safe to do this. A common mistake is to use the -t option wrongly. <pre> qsub -t 4 myjob.qsub </pre> will only start ''one'' job, with $PBS_ARRAYID set to 4. 
<pre> qsub -t 1-4 myjob.qsub </pre> will queue four jobs, with $PBS_ARRAYID ranging from 1 to 4. == Interactive jobs == If you need to access nodes interactively, see the separate page on [[interactive jobs]]. == Jobs that depend on other jobs == You may wish to submit jobs that depend on each other: for example, to submit a job that only runs after another job has finished. (This allows you to stack up very long computing requests without violating the maximum walltime constraints. Look at <tt>man qsub</tt> for the full details, but for a simple dependence use <pre> qsub -W depend=afterany:123456 myjob.qsub </pre> which will cause your job to be run only after job 123456 has finished (no matter what its output status). 54af1e5872ae2461700434b1a4c3b3a0e75310ec 242 239 2011-08-17T13:38:50Z Mjh 2 wikitext text/x-wiki The main mechanism for carrying out computing on the cluster is the '''batch queue system'''. This is based around [http://www.clusterresources.com/torquedocs21/ Torque] (formerly known as PBS) and [http://www.clusterresources.com/products/maui/docs/mauiadmin.shtml Maui], which are free software and widely used in the HPC world, particularly in academia. Torque is the queueing system, which handles job submission, and Maui is the scheduler, which decides what jobs are run. Rather than running processes directly on the compute nodes, you submit a script which, in general, contains both directives to the queuing system and the commands needed to run your jobs. The scheduler will then decide, based on the available resources, whether your job can be run immediately or must be deferred till later. If other people's jobs are running, you are likely to have to wait. It is important therefore to request only the minimum resources you need: a job that only needs 8 nodes may be able to run immediately while a job that requests all 48 nodes in the main cluster may need to wait for days. Once your job starts running the standard output and standard error can be saved or discarded, at your option. Most jobs will operate on files in one of the cluster-wide file systems (see [[storage]]). You can also arrange to have the system send you e-mail when your job starts and/or ends. You should enable [[passwordless ssh]] between nodes before starting to use the job submission system. Various parts of it and code that depends on it rely on being able to use ssh or scp. == Basic commands == The command to submit a job to the batch queue system is <tt>qsub</tt>. <tt>qsub</tt> takes one obligatory argument, the name of the script that will be run to start the job: <pre> qsub myjob.sh </pre> This must be a script: specifying a binary file will not work. Optionally, qsub can take a number of arguments that specify what resources the job requires and how it is to be handled by the queuing system. Alternatively, these can be specified in the script, tagged with lines beginning <tt>#PBS</tt> before any executable lines. For example, <pre> cat << END > myjob1.sh #!/bin/sh echo "Hello world" END qsub -N hello -m abe myjob1.sh </pre> and <pre> cat <<END > myjob2.sh #!/bin/sh #PBS -N hello #PBS -m abe echo "Hello world" END qsub myjob2.sh </pre> are functionally equivalent. It is often more useful to have the options specified in the script. You can also have a mixture: if the same option is specified on the <tt>qsub</tt> command line and in a script directive, the command line takes precedence. 
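For instance, reusing the second script above, the command-line value of <tt>-N</tt> wins:

<pre>
# myjob2.sh contains '#PBS -N hello', but the job will appear in the queue as 'goodbye'
qsub -N goodbye myjob2.sh
</pre>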
A full list of qsub options is given by <tt>man qsub</tt> or is available [http://www.clusterresources.com/torquedocs21/commands/qsub.shtml online]. Some important ones are as follows: * -N: specify the name of the job as it will appear in the queue * -m: say when the system should e-mail you: <tt>-m abe</tt> means to e-mail on abort, begin and end. Before setting this flag, read the page on [[Mail]] and take appropriate action! * -k: specify whether the standard output or standard error streams of the job should be preserved (by default in your home directory). * -j: specify whether the output and error streams should be kept separate or merged. * -o: specify where standard output/error should be stored, if not /home/user * -q: specify what [[queues|queue]] the job will be run on. * -t: start multiple jobs simultaneously (see below) * -W: for job inter-dependencies (see below) The most important option is <tt>-l</tt>, which specifies the resources to be used by the job. On our system the useful resources to specify are * <tt>walltime</tt>: the expected execution time in h:m:s format. * <tt>nodes</tt>: a list of requirements on the nodes to be allocated, or possibly more than one joined by a plus sign. * <tt>pmem</tt>: the physical [[memory]] requirements for your job. For example, <pre> qsub -l walltime=2:0:0 -l nodes=40 job.sh ## run for two hours on 40 CPUs (which may be spread over up to 40 physical nodes) qsub -l walltime=1:0:0 -l nodes=16:ppn=8 job.sh ## run for one hour on 16 nodes with 8 CPUs per node qsub -l walltime=0:1:0 -l nodes=2:chassis1:ppn=4 job2.sh ## run for one minute on 2 nodes from chassis1 with 4 CPUs per node qsub -l walltime=0:5:0 -l nodes=1:chassis1:ppn=1+1:chassis2:ppn=1 job3.sh ## run for 5 minutes on one node from chassis1 and one from chassis2 qsub -l walltime=0:0:10 -l nodes=node001:ppn=8 job4.sh ## run for 10 seconds on all CPUs of node001 qsub -l walltime=0:1:0 -l nodes=1:2 -l pmem=8gb job5.sh ## Run two processes for one minute, requesting 8 Gb of memory each </pre> (In general you will not want to specify node or [[architecture|chassis]] names directly, as this will interfere with the efficiency of the scheduling. The exception might be if you want to guarantee that all inter-process communications take place within a chassis; see [[networking]]). Note also that in the present configuration requests for fewer processors per node than the 8 that are available may be consolidated onto fewer nodes. So the third request above, run on an empty cluster, would actually run on one physical node with 8 CPUs. This should never be a problem unless you are specifically testing inter-node communication. For efficiency, you are recommended always to ask for and use 8 processes per node for MPI or multi-threaded code. If you are running single-threaded code, you should ask for one processor per node. The defaults for all the other options are sensible but you ''must'' think carefully about the resources you need. Requesting more time or resources than you need will lead to inefficiency in scheduling your jobs: requesting less than you need (particularly in terms of walltime) is likely to mean that your job does not run to completion. A job that exceeds the walltime estimate will be terminated. 
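If you are unsure how close a running job is to its limit, its full record can be inspected with <tt>qstat -f</tt>. This is a sketch, and the exact field names may vary slightly between Torque versions:

<pre>
# show the walltime fields of the full record for job 1765
qstat -f 1765 | grep -i walltime
</pre>

This typically shows both <tt>Resource_List.walltime</tt> (what was requested) and <tt>resources_used.walltime</tt> (what has been used so far).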
''A walltime should be specified: the default walltime (see [[Queues]]) is probably not what you want.'' Once a job is submitted, you can view its progress with <tt>qstat</tt>: <pre> >qstat Job id Name User Time Use S Queue ------------------- ---------------- --------------- -------- - ----- 1765.stri-cluster HCG62-doublebeta mjh 00:00:00 R all 1766.stri-cluster ...59-doublebeta mjh 00:00:00 R all 1767.stri-cluster ...07-doublebeta mjh 00:00:00 R all 1768.stri-cluster ...83-doublebeta mjh 00:00:00 R all 1769.stri-cluster ...73-doublebeta mjh 00:00:00 R all 1770.stri-cluster ...61-doublebeta mjh 0 Q all 1771.stri-cluster ...36-doublebeta mjh 0 Q all 1772.stri-cluster ...44-doublebeta mjh 0 Q all 1773.stri-cluster ...29-doublebeta mjh 0 Q all 1774.stri-cluster ...33-doublebeta mjh 0 Q all 1775.stri-cluster ...46-doublebeta mjh 0 Q all >qstat -a stri-cluster.herts.ac.uk: Req'd Req'd Elap Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time -------------------- -------- -------- ---------- ------ ----- --- ------ ----- - ----- 1765.stri-cluster.he mjh all HCG62-doub 11125 16 -- -- 05:00 R 01:56 1766.stri-cluster.he mjh all IC1459-dou 10658 16 -- -- 05:00 R 01:56 1767.stri-cluster.he mjh all NGC3607-do 7355 16 -- -- 05:00 R 01:56 1768.stri-cluster.he mjh all NGC383-dou 12069 16 -- -- 05:00 R 01:56 1769.stri-cluster.he mjh all NGC4073-do 12793 16 -- -- 05:00 R 01:56 1770.stri-cluster.he mjh all NGC4261-do -- 16 -- -- 05:00 Q -- 1771.stri-cluster.he mjh all NGC4636-do -- 16 -- -- 05:00 Q -- 1772.stri-cluster.he mjh all NGC5044-do -- 16 -- -- 05:00 Q -- 1773.stri-cluster.he mjh all NGC5129-do -- 16 -- -- 05:00 Q -- 1774.stri-cluster.he mjh all NGC533-dou -- 16 -- -- 05:00 Q -- 1775.stri-cluster.he mjh all NGC5846-do -- 16 -- -- 05:00 Q -- </pre> The different options to qstat give different views of the queue. Here we see that some jobs are running (R) and some are queued (Q). You may also see jobs that have recently completed (C), jobs that are held pending some condition (H) and, hopefully rarely, jobs where there is some error in the scheduling system (E). The MAUI tool <tt>showq</tt> can also be used to get a view of the whole queue, which is sometimes more user-friendly: <pre> /usr/local/maui/bin/showq ACTIVE JOBS-------------------- JOBNAME USERNAME STATE PROC REMAINING STARTTIME 1765 mjh Running 128 1:15:20 Fri May 7 13:38:20 1766 mjh Running 128 1:15:20 Fri May 7 13:38:20 1767 mjh Running 128 1:15:20 Fri May 7 13:38:20 1768 mjh Running 128 1:15:20 Fri May 7 13:38:20 1769 mjh Running 128 1:15:20 Fri May 7 13:38:20 5 Active Jobs 640 of 640 Processors Active (100.00%) 80 of 80 Nodes Active (100.00%) IDLE JOBS---------------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME 1770 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1771 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1772 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1773 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1774 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1775 mjh Idle 128 5:00:00 Fri May 7 13:38:20 6 Idle Jobs BLOCKED JOBS---------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME Total Jobs: 11 Active Jobs: 5 Idle Jobs: 6 Blocked Jobs: 0 </pre> Finally, you can delete jobs from the queue with the Torque command <tt>qdel</tt>, you can change your resource request with <tt>qualter</tt>, and you can move queued jobs between different queues with <tt>qmove</tt>. == Running code == Your job will start running on ''one'' of the nodes it has been allocated. 
It is your script's responsibility to start your code running on any other nodes you have been allocated. At run time, environment variables contain information about the job. A full list is [http://www.clusterresources.com/torquedocs21/2.1jobsubmission.shtml#submitvars here]. The crucial one is $PBS_NODEFILE, which provides a list of nodes on which the job should run. ''You must honour this node list.'' The [http://www.clusterresources.com/torquedocs21/commands/pbsdsh.shtml pbsdsh] command can be used in your script to run a single program on all the allocated nodes, as in the following example: <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ pbsdsh hostname echo ------------------------------------------------------ echo Job ends </pre> pbsdsh would be suitable for starting a number of parallel processes which do not need to communicate, e.g. multiple identical Monte Carlo simulations. For scripts to start [[MPI]] processes in such a way that inter-process communication (IPC) works see the description of [[MPI]]. == Environment == In all cases your script must set the environment that you want your program to run in. Environment variables, aliases and working directory are not inherited from the shell that runs qsub. So you will often want to change to an appropriate working directory and to source some startup files in your qsub script before you actually execute any code. <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ cd /home/fred/my_working_directory export PATH=/home/fred/my_binaries:${PATH} /usr/local/bin/mpiexec my-mpi-code arg1 arg2 echo ------------------------------------------------------ echo Job ends </pre> == Multiple job submission == The <tt>-t</tt> option to qsub allows you to create multiple jobs simultaneously. These will all run the same script but will differ in the value of the environment variable <tt>$PBS_ARRAYID</tt>. Your script should normally use this variable to decide how to behave differently. For example, multiple runs of a Monte Carlo simulation might use it to store the output in different files. 
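As a sketch of how this might look in practice (the program and file names are placeholders), a script intended for submission with <tt>-t</tt> could be:
<pre>
#!/bin/sh
#PBS -N montecarlo
#PBS -l nodes=1:ppn=1
#PBS -l walltime=1:00:00

# Each copy of the job uses its array id to pick a different input
# file and to write a separate output file.
cd $PBS_O_WORKDIR
./my-simulation input.$PBS_ARRAYID > output.$PBS_ARRAYID
</pre>
Submitted with <tt>qsub -t 1-10 myjob.qsub</tt>, this would queue ten such jobs, the first reading <tt>input.1</tt> and the last <tt>input.10</tt>.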
The jobs will all be submitted to the queue immediately and may run concurrently, depending on the resources they request, so it is important to make sure that it is safe to do this. A common mistake is to use the -t option wrongly. <pre> qsub -t 4 myjob.qsub </pre> will only start ''one'' job, with $PBS_ARRAYID set to 4. <pre> qsub -t 1-4 myjob.qsub </pre> will queue four jobs, with $PBS_ARRAYID ranging from 1 to 4. == Interactive jobs == If you need to access nodes interactively, see the separate page on [[interactive jobs]]. == Jobs that depend on other jobs == You may wish to submit jobs that depend on each other: for example, to submit a job that only runs after another job has finished. (This allows you to stack up very long computing requests without violating the maximum walltime constraints. Look at <tt>man qsub</tt> for the full details, but for a simple dependence use <pre> qsub -W depend=afterany:123456 myjob.qsub </pre> which will cause your job to be run only after job 123456 has finished (no matter what its output status). 7d503ccbcc0d8e1815ca5e958c257a0172076df1 250 242 2011-10-18T15:55:43Z Mjh 2 wikitext text/x-wiki The main mechanism for carrying out computing on the cluster is the '''batch queue system'''. This is based around [http://www.clusterresources.com/torquedocs21/ Torque] (formerly known as PBS) and [http://www.clusterresources.com/products/maui/docs/mauiadmin.shtml Maui], which are free software and widely used in the HPC world, particularly in academia. Torque is the queueing system, which handles job submission, and Maui is the scheduler, which decides what jobs are run. Rather than running processes directly on the compute nodes, you submit a script which, in general, contains both directives to the queuing system and the commands needed to run your jobs. The scheduler will then decide, based on the available resources, whether your job can be run immediately or must be deferred till later. If other people's jobs are running, you are likely to have to wait. It is important therefore to request only the minimum resources you need: a job that only needs 8 nodes may be able to run immediately while a job that requests all 48 nodes in the main cluster may need to wait for days. Once your job starts running the standard output and standard error can be saved or discarded, at your option. Most jobs will operate on files in one of the cluster-wide file systems (see [[storage]]). You can also arrange to have the system send you e-mail when your job starts and/or ends. You should enable [[passwordless ssh]] between nodes before starting to use the job submission system. Various parts of it and code that depends on it rely on being able to use ssh or scp. == Basic commands == The command to submit a job to the batch queue system is <tt>qsub</tt>. <tt>qsub</tt> takes one obligatory argument, the name of the script that will be run to start the job: <pre> qsub myjob.sh </pre> This must be a script: specifying a binary file will not work. Optionally, qsub can take a number of arguments that specify what resources the job requires and how it is to be handled by the queuing system. Alternatively, these can be specified in the script, tagged with lines beginning <tt>#PBS</tt> before any executable lines. For example, <pre> cat << END > myjob1.sh #!/bin/sh echo "Hello world" END qsub -N hello -m abe myjob1.sh </pre> and <pre> cat <<END > myjob2.sh #!/bin/sh #PBS -N hello #PBS -m abe echo "Hello world" END qsub myjob2.sh </pre> are functionally equivalent. 
It is often more useful to have the options specified in the script. You can also have a mixture: if the same option is specified on the <tt>qsub</tt> command line and in a script directive, the command line takes precedence. A full list of qsub options is given by <tt>man qsub</tt> or is available [http://www.clusterresources.com/torquedocs21/commands/qsub.shtml online]. Some important ones are as follows: * -N: specify the name of the job as it will appear in the queue * -m: say when the system should e-mail you: <tt>-m abe</tt> means to e-mail on abort, begin and end. Before setting this flag, read the page on [[Mail]] and take appropriate action! * -k: specify whether the standard output or standard error streams of the job should be mirrored in your home directory. If you specify this, you will be able to see the output as it's generated, but it will appear in your home directory. If you don't, the output will be stored in the directory specified by -o and -e or in the current working directory of qsub if they are not specified, but it will only appear once the job is finished. * -j: specify whether the output and error streams should be kept separate or merged. * -o, -e: specify where standard output/error should be stored, if not in the current directory * -q: specify what [[queues|queue]] the job will be run on. * -t: start multiple jobs simultaneously (see below) * -W: for job inter-dependencies (see below) The most important option is <tt>-l</tt>, which specifies the resources to be used by the job. On our system the useful resources to specify are * <tt>walltime</tt>: the expected execution time in h:m:s format. * <tt>nodes</tt>: a list of requirements on the nodes to be allocated, or possibly more than one joined by a plus sign. * <tt>pmem</tt>: the physical [[memory]] requirements for your job. For example, <pre> qsub -l walltime=2:0:0 -l nodes=40 job.sh ## run for two hours on 40 CPUs (which may be spread over up to 40 physical nodes) qsub -l walltime=1:0:0 -l nodes=16:ppn=8 job.sh ## run for one hour on 16 nodes with 8 CPUs per node qsub -l walltime=0:1:0 -l nodes=2:chassis1:ppn=4 job2.sh ## run for one minute on 2 nodes from chassis1 with 4 CPUs per node qsub -l walltime=0:5:0 -l nodes=1:chassis1:ppn=1+1:chassis2:ppn=1 job3.sh ## run for 5 minutes on one node from chassis1 and one from chassis2 qsub -l walltime=0:0:10 -l nodes=node001:ppn=8 job4.sh ## run for 10 seconds on all CPUs of node001 qsub -l walltime=0:1:0 -l nodes=1:2 -l pmem=8gb job5.sh ## Run two processes for one minute, requesting 8 Gb of memory each </pre> (In general you will not want to specify node or [[architecture|chassis]] names directly, as this will interfere with the efficiency of the scheduling. The exception might be if you want to guarantee that all inter-process communications take place within a chassis; see [[networking]]). Note also that in the present configuration requests for fewer processors per node than the 8 that are available may be consolidated onto fewer nodes. So the third request above, run on an empty cluster, would actually run on one physical node with 8 CPUs. This should never be a problem unless you are specifically testing inter-node communication. For efficiency, you are recommended always to ask for and use 8 processes per node for MPI or multi-threaded code. If you are running single-threaded code, you should ask for one processor per node. The defaults for all the other options are sensible but you ''must'' think carefully about the resources you need. 
Requesting more time or resources than you need will lead to inefficiency in scheduling your jobs: requesting less than you need (particularly in terms of walltime) is likely to mean that your job does not run to completion. A job that exceeds the walltime estimate will be terminated. ''A walltime should be specified: the default walltime (see [[Queues]]) is probably not what you want.'' Once a job is submitted, you can view its progress with <tt>qstat</tt>: <pre> >qstat Job id Name User Time Use S Queue ------------------- ---------------- --------------- -------- - ----- 1765.stri-cluster HCG62-doublebeta mjh 00:00:00 R all 1766.stri-cluster ...59-doublebeta mjh 00:00:00 R all 1767.stri-cluster ...07-doublebeta mjh 00:00:00 R all 1768.stri-cluster ...83-doublebeta mjh 00:00:00 R all 1769.stri-cluster ...73-doublebeta mjh 00:00:00 R all 1770.stri-cluster ...61-doublebeta mjh 0 Q all 1771.stri-cluster ...36-doublebeta mjh 0 Q all 1772.stri-cluster ...44-doublebeta mjh 0 Q all 1773.stri-cluster ...29-doublebeta mjh 0 Q all 1774.stri-cluster ...33-doublebeta mjh 0 Q all 1775.stri-cluster ...46-doublebeta mjh 0 Q all >qstat -a stri-cluster.herts.ac.uk: Req'd Req'd Elap Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time -------------------- -------- -------- ---------- ------ ----- --- ------ ----- - ----- 1765.stri-cluster.he mjh all HCG62-doub 11125 16 -- -- 05:00 R 01:56 1766.stri-cluster.he mjh all IC1459-dou 10658 16 -- -- 05:00 R 01:56 1767.stri-cluster.he mjh all NGC3607-do 7355 16 -- -- 05:00 R 01:56 1768.stri-cluster.he mjh all NGC383-dou 12069 16 -- -- 05:00 R 01:56 1769.stri-cluster.he mjh all NGC4073-do 12793 16 -- -- 05:00 R 01:56 1770.stri-cluster.he mjh all NGC4261-do -- 16 -- -- 05:00 Q -- 1771.stri-cluster.he mjh all NGC4636-do -- 16 -- -- 05:00 Q -- 1772.stri-cluster.he mjh all NGC5044-do -- 16 -- -- 05:00 Q -- 1773.stri-cluster.he mjh all NGC5129-do -- 16 -- -- 05:00 Q -- 1774.stri-cluster.he mjh all NGC533-dou -- 16 -- -- 05:00 Q -- 1775.stri-cluster.he mjh all NGC5846-do -- 16 -- -- 05:00 Q -- </pre> The different options to qstat give different views of the queue. Here we see that some jobs are running (R) and some are queued (Q). You may also see jobs that have recently completed (C), jobs that are held pending some condition (H) and, hopefully rarely, jobs where there is some error in the scheduling system (E). 
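Two further <tt>qstat</tt> forms are often useful (the job id and username here are simply taken from the example output above; <tt>-u</tt> is a standard Torque option, but check <tt>man qstat</tt> for the details on our system):
<pre>
qstat -f 1765    ## full details of a single job
qstat -u mjh     ## list only one user's jobs
</pre>
In the <tt>qstat -f</tt> output, the <tt>resources_used</tt> lines show, among other things, how much memory the job is currently using (see also [[memory]]).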
The MAUI tool <tt>showq</tt> can also be used to get a view of the whole queue, which is sometimes more user-friendly: <pre> /usr/local/maui/bin/showq ACTIVE JOBS-------------------- JOBNAME USERNAME STATE PROC REMAINING STARTTIME 1765 mjh Running 128 1:15:20 Fri May 7 13:38:20 1766 mjh Running 128 1:15:20 Fri May 7 13:38:20 1767 mjh Running 128 1:15:20 Fri May 7 13:38:20 1768 mjh Running 128 1:15:20 Fri May 7 13:38:20 1769 mjh Running 128 1:15:20 Fri May 7 13:38:20 5 Active Jobs 640 of 640 Processors Active (100.00%) 80 of 80 Nodes Active (100.00%) IDLE JOBS---------------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME 1770 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1771 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1772 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1773 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1774 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1775 mjh Idle 128 5:00:00 Fri May 7 13:38:20 6 Idle Jobs BLOCKED JOBS---------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME Total Jobs: 11 Active Jobs: 5 Idle Jobs: 6 Blocked Jobs: 0 </pre> Finally, you can delete jobs from the queue with the Torque command <tt>qdel</tt>, you can change your resource request with <tt>qualter</tt>, and you can move queued jobs between different queues with <tt>qmove</tt>. == Running code == Your job will start running on ''one'' of the nodes it has been allocated. It is your script's responsibility to start your code running on any other nodes you have been allocated. At run time, environment variables contain information about the job. A full list is [http://www.clusterresources.com/torquedocs21/2.1jobsubmission.shtml#submitvars here]. The crucial one is $PBS_NODEFILE, which provides a list of nodes on which the job should run. ''You must honour this node list.'' The [http://www.clusterresources.com/torquedocs21/commands/pbsdsh.shtml pbsdsh] command can be used in your script to run a single program on all the allocated nodes, as in the following example: <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ pbsdsh hostname echo ------------------------------------------------------ echo Job ends </pre> pbsdsh would be suitable for starting a number of parallel processes which do not need to communicate, e.g. multiple identical Monte Carlo simulations. For scripts to start [[MPI]] processes in such a way that inter-process communication (IPC) works see the description of [[MPI]]. == Environment == In all cases your script must set the environment that you want your program to run in. Environment variables, aliases and working directory are not inherited from the shell that runs qsub. So you will often want to change to an appropriate working directory and to source some startup files in your qsub script before you actually execute any code. 
<pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ cd /home/fred/my_working_directory export PATH=/home/fred/my_binaries:${PATH} /usr/local/bin/mpiexec my-mpi-code arg1 arg2 echo ------------------------------------------------------ echo Job ends </pre> == Multiple job submission == The <tt>-t</tt> option to qsub allows you to create multiple jobs simultaneously. These will all run the same script but will differ in the value of the environment variable <tt>$PBS_ARRAYID</tt>. Your script should normally use this variable to decide how to behave differently. For example, multiple runs of a Monte Carlo simulation might use it to store the output in different files. The jobs will all be submitted to the queue immediately and may run concurrently, depending on the resources they request, so it is important to make sure that it is safe to do this. A common mistake is to use the -t option wrongly. <pre> qsub -t 4 myjob.qsub </pre> will only start ''one'' job, with $PBS_ARRAYID set to 4. <pre> qsub -t 1-4 myjob.qsub </pre> will queue four jobs, with $PBS_ARRAYID ranging from 1 to 4. == Interactive jobs == If you need to access nodes interactively, see the separate page on [[interactive jobs]]. == Jobs that depend on other jobs == You may wish to submit jobs that depend on each other: for example, to submit a job that only runs after another job has finished. (This allows you to stack up very long computing requests without violating the maximum walltime constraints.) Look at <tt>man qsub</tt> for the full details, but for a simple dependence use <pre> qsub -W depend=afterany:123456 myjob.qsub </pre> which will cause your job to be run only after job 123456 has finished (no matter what its output status). 0b4a8c5c674c96f95ae6965d6f5dc53e1d4a86a0 Memory 0 36 222 2011-04-28T21:40:11Z Mjh 2 Created page with 'Jobs that will use a large amount of physical memory must make sure that they request this in the job submission command or script. As described in the section on [[architecture…' wikitext text/x-wiki Jobs that will use a large amount of physical memory must make sure that they request this in the job submission command or script. As described in the section on [[architecture]], the nodes of the main cluster have 24 Gb of physical memory. If the total amount of memory used by all jobs running on the nodes is larger than, or even starts to approach this value, then performance will drop, jobs may be unable to allocate memory and will eventually be killed, and the nodes may even crash (as the Linux out-of-memory killer appears often to leave a node in an unstable state). To try to avoid this situation arising, jobs submitted to the main [[queue]] are limited to request, by default, no more than 1 Gb of physical memory per process. If your job exceeds this, it will be automatically killed.
If you need more than this, you must request it explicitly using the <tt>pmem</tt> attribute in the job control system. <tt>pmem</tt> sets the amount of physical memory, per process, that your jobs will use. So, for example, if you need 8 Gb of memory per process, an example job submission script would look like this: <pre> #!/bin/sh -f #PBS -N large-job #PBS -m abe #PBS -l nodes=8 #PBS -l walltime=00:01:00 #PBS -l pmem=8gb #PBS -k oe ... job commands go here ... </pre> This job will attempt to run 8 separate processes, but the scheduling system will not (as it would by default) place them all on the same node, since it knows that the processes would require more memory than is available. They will be placed on separate nodes where their memory requirements will not interfere with each other. It's the user's responsibility to figure out a sensible memory request for large jobs. Err on the side of generosity, but bear in mind that asking for too much memory will make the scheduling of both your and other users' jobs inefficient. 1ea223c88c0636ec48dc80f207b679db7eb5b07e 223 222 2011-04-28T21:42:15Z Mjh 2 wikitext text/x-wiki Jobs that will use a large amount of physical memory must make sure that they request this in the job submission command or script. As described in the section on [[architecture]], the nodes of the main cluster have 24 Gb of physical memory. If the total amount of memory used by all jobs running on the nodes is larger than, or even starts to approach this value, then performance will drop, jobs may be unable to allocate memory and will eventually be killed, and the nodes may even crash (as the Linux out-of-memory killer appears often to leave a node in an unstable state). To try to avoid this situation arising, jobs submitted to the main [[queue]] are limited to request, by default, no more than 1 Gb of physical memory per process. If your job exceeds this, it will be automatically killed. If you need more than this, you must request it explicitly using the <tt>pmem</tt> attribute in the job control system. <tt>pmem</tt> sets the amount of physical memory, per process, that your jobs will use. So, for example, if you need 8 Gb of memory per process, an example job submission script would look like this: <pre> #!/bin/sh -f #PBS -N large-job #PBS -m abe #PBS -l nodes=8 #PBS -l walltime=00:01:00 #PBS -l pmem=8gb #PBS -k oe ... job commands go here ... </pre> This job will attempt to run 8 separate processes, but the scheduling system will not (as it would by default) place them all on the same node, since it knows that the processes would require more memory than is available. They will be placed on separate nodes where their memory requirements will not interfere with each other. It's the user's responsibility to figure out a sensible memory request for large jobs. Err on the side of generosity, but bear in mind that asking for too much memory will make the scheduling of both your and other users' jobs inefficient. Users needing very large amounts of memory should consider using the [[SMP machines]], which have 256 Gb of physical memory. (If doing this, it is still sensible to use the <tt>pmem</tt> job attribute to make sure that you will not interfere with other users.) 783612000f84e816b06810ea94ef821e37b5f125 224 223 2011-04-28T21:42:44Z Mjh 2 wikitext text/x-wiki Jobs that will use a large amount of physical memory must make sure that they request this in the job submission command or script.
As described in the section on [[architecture]], the nodes of the main cluster have 24 Gb of physical memory. If the total amount of memory used by all jobs running on the nodes is larger than, or even starts to approach this value, then performance will drop, jobs may be unable to allocate memory and will eventually be killed, and the nodes may even crash (as the Linux out-of-memory killer appears often to leave a node in an unstable state). To try to avoid this situation arising, jobs submitted to the main [[queues|queue]] are limited to request, by default, no more than 1 Gb of physical memory per process. If your job exceeds this, it will be automatically killed. If you need more than this, you must request it explicitly using the <tt>pmem</tt> attribute in the job control system. <tt>pmem</tt> sets the amount of physical memory, per process, that your jobs will use. So, for example, if you need 8 Gb of memory per process, an example job submission script would look like this: <pre> #!/bin/sh -f #PBS -N large-job #PBS -m abe #PBS -l nodes=8 #PBS -l walltime=00:01:00 #PBS -l pmem=8gb #PBS -k oe ... job commands go here ... </pre> This job will attempt to run 8 separate processes, but the scheduling system will not (as it would by default) place them all on the same node, since it knows that the processes would require more memory than is available. They will be placed on separate nodes where their memory requirements will not interfere with each other. It's the user's responsibility to figure out a sensible memory request for large jobs. Err on the side of generosity, but bear in mind that asking for too much memory will make the scheduling of both your and other users' jobs inefficient. Users needing very large amounts of memory should consider using the [[SMP machines]], which have 256 Gb of physical memory. (If doing this, it is still sensible to use the <tt>pmem</tt> job attribute to make sure that you will not interfere with other users.) 279930f0708ee0d291aae663f00311c69b92b9b9 225 224 2011-04-28T21:56:40Z Mjh 2 wikitext text/x-wiki Jobs that will use a large amount of physical memory must make sure that they request this in the job submission command or script. As described in the section on [[architecture]], the nodes of the main cluster have 24 Gb of physical memory. If the total amount of memory used by all jobs running on the nodes is larger than, or even starts to approach this value, then performance will drop, jobs may be unable to allocate memory and will eventually be killed, and the nodes may even crash (as the Linux out-of-memory killer appears often to leave a node in an unstable state). To try to avoid this situation arising, jobs submitted to the main [[queues|queue]] are limited to request, by default, no more than 1 Gb of physical memory per process. If your job exceeds this, it will be automatically killed. If you need more than this, you must request it explicitly using the <tt>pmem</tt> attribute in the job control system. <tt>pmem</tt> sets the amount of physical memory, per process, that your jobs will use. So, for example, if you need 8 Gb of memory per process, an example job submission script would look like this: <pre> #!/bin/sh -f #PBS -N large-job #PBS -m abe #PBS -l nodes=8 #PBS -l walltime=00:01:00 #PBS -l pmem=8gb #PBS -k oe ... job commands go here ...
</pre> This job will attempt to run 8 separate processes, but the scheduling system will not (as it would by default) place them all on the same node, since it knows that the processes would require more memory than is available. They will be placed on separate nodes where their memory requirements will not interfere with each other. It's the user's responsibility to figure out a sensible memory request for large jobs. Err on the side of generosity, but bear in mind that asking for too much memory will make the scheduling of both your and other users' jobs inefficient. Please bear in mind that the typical cluster job runs very comfortably in 1 Gb. You can see how much physical memory a running job is using by doing <tt>qstat -f <jobid></tt>: the line <tt>resources_used.mem</tt> tells you the total memory use for all processes. Users needing very large amounts of memory should consider using the [[SMP machines]], which have 256 Gb of physical memory. (If doing this, it is still sensible to use the <tt>pmem</tt> job attribute to make sure that you will not interfere with other users.) 65a00886a896fb6aee2859724550de0b9837e95d Queues 0 15 226 187 2011-04-28T22:03:54Z Mjh 2 wikitext text/x-wiki There are five possible job queues available on the system: * 'main' is the default queue: this submits to the 48 nodes of the main cluster. The maximum wall time on this queue is 1 week. * 'cair_s' submits to the 32 CAIR nodes, with a maximum wall time of 6 hours. Users may have at most 2 jobs in this queue at a time. * 'cair_l' also submits to the CAIR nodes with no maximum wall time. This queue is restricted to CAIR users. * 'smp' is a special queue that submits to the two [[SMP machines]]. It is not currently possible to run MPI jobs that span the SMP machines and the main or CAIR clusters. The maximum wall time for this queue is 48 hours. * 'sandbox' submits to the [[sandbox]] machines and is for use for development and testing. The maximum wall time for this queue is 1 hour. == Default wall times == The default wall time for the 'sandbox' queue is also the maximum, i.e. 1 hour. The default wall time limitation for the 'cair_s' queue is also the maximum, i.e. 6 hours. The default wall time for all the other queues is 24 hours. If you want a job to run for longer, you must explicitly provide a wall time estimate. == Other defaults == The default memory use per process for the main queue is 1 Gb. If you need more memory than this, see the section on [[memory]]. The default number of nodes for a job on the main queue is 1. 02deecebabea649e94d19cfc24fab7e2fc930448 229 226 2011-05-26T17:21:22Z Mjh 2 wikitext text/x-wiki There are six possible job queues available on the system: * 'main' is the default queue: this submits to the 48 nodes of the main cluster. The maximum wall time on this queue is 1 week. * 'cair_s' submits to the 46 CAIR nodes, with a maximum wall time of 6 hours. Users may have at most 2 jobs in this queue at a time. * 'cair_l' also submits to the CAIR nodes with no maximum wall time. This queue is restricted to CAIR users. * 'smp' submits to the two [[SMP machines]]. It is not currently possible to run MPI jobs that span the SMP machines and the main or CAIR clusters. The maximum wall time for this queue is 48 hours. * 'sandbox' submits to the [[sandbox]] machines and is for use for development and testing. The maximum wall time for this queue is 1 hour. * 'car' submits to the 4 dedicated CAR machines. This queue is restricted to CAR machines. 
It is not currently possible to run Infiniband jobs on the CAR machines. == Default wall times == The default wall time for the 'sandbox' queue is also the maximum, i.e. 1 hour. The default wall time limitation for the 'cair_s' queue is also the maximum, i.e. 6 hours. The default wall time for all the other queues is 24 hours. If you want a job to run for longer, you must explicitly provide a wall time estimate. == Other defaults == The default memory use per process for the main queue is 1 Gb. If you need more memory than this, see the section on [[memory]]. The default number of nodes for a job on the main queue is 1. c82a5b69ba7b5a3f7ca3d13feeff83009bb0e229 230 229 2011-05-26T17:21:35Z Mjh 2 wikitext text/x-wiki There are six possible job queues available on the system: * 'main' is the default queue: this submits to the 48 nodes of the main cluster. The maximum wall time on this queue is 1 week. * 'cair_s' submits to the 46 CAIR nodes, with a maximum wall time of 6 hours. Users may have at most 2 jobs in this queue at a time. * 'cair_l' also submits to the CAIR nodes with no maximum wall time. This queue is restricted to CAIR users. * 'smp' submits to the two [[SMP machines]]. It is not currently possible to run MPI jobs that span the SMP machines and the main or CAIR clusters. The maximum wall time for this queue is 48 hours. * 'sandbox' submits to the [[sandbox]] machines and is for use for development and testing. The maximum wall time for this queue is 1 hour. * 'car' submits to the 4 dedicated CAR machines. This queue is restricted to CAR users. It is not currently possible to run Infiniband jobs on the CAR machines. == Default wall times == The default wall time for the 'sandbox' queue is also the maximum, i.e. 1 hour. The default wall time limitation for the 'cair_s' queue is also the maximum, i.e. 6 hours. The default wall time for all the other queues is 24 hours. If you want a job to run for longer, you must explicitly provide a wall time estimate. == Other defaults == The default memory use per process for the main queue is 1 Gb. If you need more memory than this, see the section on [[memory]]. The default number of nodes for a job on the main queue is 1. 706e8e173e2730666a83e30ee8ae184049ccc26c 241 230 2011-08-17T13:32:03Z Mjh 2 wikitext text/x-wiki There are six possible job queues available on the system: * 'main' is the default queue: this submits to the 48 nodes of the main cluster. The maximum wall time on this queue is 1 week. * 'cair_s' submits to the 46 CAIR nodes, with a maximum wall time of 6 hours. Users may have at most 2 jobs in this queue at a time. * 'cair_l' also submits to the CAIR nodes with no maximum wall time. This queue is restricted to CAIR users. * 'smp' submits to the two [[SMP machines]]. It is not currently possible to run MPI jobs that span the SMP machines and the main or CAIR clusters. The maximum wall time for this queue is 48 hours. * 'sandbox' submits to the [[sandbox]] machines and is for use for development and testing. The maximum wall time for this queue is 1 hour. * 'car' submits to the 4 dedicated CAR machines. This queue is restricted to CAR users. == Default wall times == The default wall time for the 'sandbox' queue is also the maximum, i.e. 1 hour. The default wall time limitation for the 'cair_s' queue is also the maximum, i.e. 6 hours. The default wall time for all the other queues is 24 hours. If you want a job to run for longer, you must explicitly provide a wall time estimate. 
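For example (the script names are placeholders), to ask for three days on the default 'main' queue, or for a half-hour test on the sandbox queue:
<pre>
qsub -q main -l walltime=72:00:00 myjob.sh    ## three days, within the one-week maximum
qsub -q sandbox -l walltime=0:30:0 test.sh    ## a short test job
</pre>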
== Other defaults == The default memory use per process for the main queue is 1 Gb. If you need more memory than this, see the section on [[memory]]. The default number of nodes for a job on the main queue is 1. 1330d451091183ba4337e23402f93e7fab9ceb40 243 241 2011-09-06T12:07:08Z Mjh 2 wikitext text/x-wiki There are six possible job queues available on the system: * 'main' is the default queue: this submits to the 48 nodes of the main cluster. The maximum wall time on this queue is 1 week. * 'cair_s' submits to the 46 CAIR nodes, with a maximum wall time of 6 hours. Users may have at most 2 jobs in this queue at a time. * 'cair_l' also submits to the CAIR nodes with no maximum wall time. This queue is restricted to CAIR users. * 'smp' submits to the two [[SMP machines]]. It is not currently possible to run MPI jobs that span the SMP machines and the main or CAIR clusters. The maximum wall time for this queue is 48 hours. * 'sandbox' submits to the [[sandbox]] machines and is for use for development and testing. The maximum wall time for this queue is 1 hour. * 'car' submits to the 16 dedicated CAR machines. This queue is restricted to CAR users. == Default wall times == The default wall time for the 'sandbox' queue is also the maximum, i.e. 1 hour. The default wall time limitation for the 'cair_s' queue is also the maximum, i.e. 6 hours. The default wall time for all the other queues is 24 hours. If you want a job to run for longer, you must explicitly provide a wall time estimate. == Other defaults == The default memory use per process for the main queue is 1 Gb. If you need more memory than this, see the section on [[memory]]. The default number of nodes for a job on the main queue is 1. 2b5cc44b74f917dd53f77978494ccfb71d72a9c0 Policies 0 4 227 207 2011-05-20T11:35:07Z Mjh 2 wikitext text/x-wiki The cluster is by design a shared resource. In using it you must be considerate of other users. Some detailed guidelines are as follows: * The [[architecture|head node]] must '''never''' be used for computation of any kind on a larger scale than a few minutes' testing or processing of results from the compute nodes. * The only permitted method of using the compute nodes is by way of the [[jobs|batch queuing system]]. You '''must not''' log in to the compute nodes directly using ssh. If you need an interactive session on a compute node, e.g. for data processing, use the [[interactive jobs]] facility. * When using the batch queuing system you '''must''' honour the allocations of nodes that your job is given. * Please use the [[storage]] with consideration to others. We reserve the right in extreme situations to tidy up after you. * There is a fair-share policy which has been tuned to a certain extent and is operated automatically by the scheduler. Essentially this means that if you have been making heavy use of the cluster on a timescale of days (to be exact, a weighted integral over the past week is taken) future jobs will be given a lower priority; you may find that other people's jobs overtake yours in the queue in an attempt to distribute resources fairly. The intention is that this system will operate only on this timescale of days, though. We do not presently guarantee fairness on timescale of minutes to hours, because to do that we would have to be able to pre-empt jobs that are already running. Please be aware of these limitations of the fair-share policy and work within them. 
a06f9b8e42b458f58054a068fc477e830b2157ce 261 227 2012-01-14T09:07:03Z Mjh 2 wikitext text/x-wiki The cluster is by design a shared resource. In using it you must be considerate of other users. Some detailed guidelines are as follows: * The [[architecture|head node]] must '''never''' be used for computation of any kind on a larger scale than a few minutes' testing or processing of results from the compute nodes. * The only permitted method of using the compute nodes is by way of the [[jobs|batch queuing system]]. You '''must not''' log in to the compute nodes directly using ssh. If you need an interactive session on a compute node, e.g. for data processing, use the [[interactive jobs]] facility. * When using the batch queuing system you '''must''' honour the allocations of nodes that your job is given. * Please use the [[storage]] with consideration to others. We reserve the right in extreme situations to tidy up after you. * There is a fair-share policy which has been tuned to a certain extent and is operated automatically by the scheduler. Essentially this means that if you have been making heavy use of the cluster on a timescale of days (to be exact, a weighted integral over the past week is taken) future jobs will be given a lower priority; you may find that other people's jobs overtake yours in the queue in an attempt to distribute resources fairly. The intention is that this system will operate only on this timescale of days, though. We do not presently guarantee fairness on timescale of minutes to hours, because to do that we would have to be able to pre-empt jobs that are already running. Please be aware of these limitations of the fair-share policy and work within them. * If you are one of the group of people with exclusive or privileged use of a group of nodes, you must use those nodes in preference to the nodes of the main cluster if possible. You should not run jobs in the 'main' queue unless no nodes in your reserved group are available. This applies to members of CAIR and CAR. 901124238308950bef7f3cabfff1acbb9ae40c13 Known problems 0 25 231 211 2011-06-09T14:17:56Z Mjh 2 wikitext text/x-wiki == Known problems == * Although the Infiniband links carry NFS traffic between the nodes and the head node, they are not using RDMA and so their latency and bandwidth is lower than it should be. This seems to be a problem with the Linux kernels on the nodes; NFS over RDMA+infiniband is unstable. * QDR infiniband uplinks between chassis4/5 and the infiniband switch in that rack do not work at the expected speed. * I/O intensive operations on the data or home directories can cause nodes to lock up in some situations. This only happens occasionally, but is clearly associated with NFS activity. 481b46c059f96225b42113ba129bba251116e785 240 231 2011-08-17T13:30:39Z Mjh 2 /* Known problems */ wikitext text/x-wiki == Known problems == * Although the Infiniband links carry NFS traffic between the nodes and the head node, they are not using RDMA and so their latency is higher and bandwidth lower than it should be. This seems to be a problem with the Linux kernels on the nodes; NFS over RDMA+infiniband is unstable. * QDR infiniband uplinks between chassis4/5/6 and the infiniband switch in that rack do not work at the expected speed. This affects the speed of jobs in the CAIR nodes that attempt to run over more than one chassis. * I/O intensive operations on the data or home directories can cause nodes to lock up in some situations. 
This only happens occasionally, but is clearly associated with NFS activity. 4e7ec76a96bf370b80354369386f005b8ade078b 246 240 2011-09-12T07:19:33Z Mjh 2 wikitext text/x-wiki == Known problems == * Although the Infiniband links carry NFS traffic between the nodes and the head node, they are not using RDMA and so their latency is higher and bandwidth lower than it should be. This seems to be a problem with the Linux kernels on the nodes; NFS over RDMA+infiniband is unstable. * I/O intensive operations on the data or home directories can cause nodes to lock up in some situations. This only happens occasionally, but is clearly associated with NFS activity. 008f8e20b6dd724b3da7accf8b42ceb79a2b08dd 252 246 2011-10-30T09:31:35Z Mjh 2 wikitext text/x-wiki == Known problems == * Although the Infiniband links carry NFS traffic between the nodes and the head node, they are not using RDMA and so their latency is higher and bandwidth lower than it should be. This seems to be a problem with the Linux kernels on the nodes; NFS over RDMA+infiniband is unstable. * I/O intensive operations on the data or home directories can cause nodes to lock up in some situations. This only happens occasionally, but is clearly associated with NFS activity. We know of several workarounds so if your job regularly crashes nodes please discuss it with us. 4113badb4817ceff38fcdbef73c1743afe15b028 Cluster bibliography 0 30 232 190 2011-06-23T08:32:02Z Akukol 3 wikitext text/x-wiki Please add details of any papers written using the cluster here. This allows us to keep track of the productivity of the cluster. You can use any format you like so long as you identify the authors, the title (this helps us keep track of what different areas of research are being carried out with the cluster), and an indication of the bibliographic details and date. * Kukol, A, Consensus virtual screening approaches to predict protein ligands, ''Eur J Med Chem'', '''2011''', doi:10.1016/j.ejmech.2011.05.026 * Hardcastle MJ, Croston JH, Modelling TeV gamma-ray emission from the kiloparsec-scale jets of Centaurus A and M87, accepted by ''MNRAS'', '''2011''' Mar * de Sousa G, Maex R, Adams R, Davey N, Steuber V, Optimization of neuronal morphologies for pattern recognition, published at ''BMC Neuroscience'' '''2010''' * Karen Safaryan, Reinoud Maex, Rod Adams, Neil Davey, and Volker Steuber The Effect of Non-Specific LTD on Pattern Recognition in Cerebellar Purkinje Cells, published at ''BMC Neuroscience'' '''11''', P92 15f7efb58265e1c079c09c5c4a9c8c4ea70c0f0d 235 232 2011-07-11T18:39:18Z Mjh 2 wikitext text/x-wiki Please add details of any papers written using the cluster here. This allows us to keep track of the productivity of the cluster. You can use any format you like so long as you identify the authors, the title (this helps us keep track of what different areas of research are being carried out with the cluster), and an indication of the bibliographic details and date. * Kukol, A, Consensus virtual screening approaches to predict protein ligands, ''Eur J Med Chem'', '''2011''', doi:10.1016/j.ejmech.2011.05.026 * Hardcastle MJ, Croston JH, Modelling TeV gamma-ray emission from the kiloparsec-scale jets of Centaurus A and M87, accepted by ''MNRAS'', '''2011''' Mar * Goodger JL, Croston JH, Hardcastle MJ, The influence of radio-loud AGN on groups of galaxies: Paper I - the X-ray luminosity-temperature relationship, submitted to ''MNRAS'', May 2011. 
* de Sousa G, Maex R, Adams R, Davey N, Steuber V, Optimization of neuronal morphologies for pattern recognition, published at ''BMC Neuroscience'' '''2010''' * Karen Safaryan, Reinoud Maex, Rod Adams, Neil Davey, and Volker Steuber The Effect of Non-Specific LTD on Pattern Recognition in Cerebellar Purkinje Cells, published at ''BMC Neuroscience'' '''11''', P92 bbfa0de766db6a3593f7c7ebd1a586ca5be78f82 237 235 2011-07-29T08:34:45Z Akukol 3 wikitext text/x-wiki Please add details of any papers written using the cluster here. This allows us to keep track of the productivity of the cluster. You can use any format you like so long as you identify the authors, the title (this helps us keep track of what different areas of research are being carried out with the cluster), and an indication of the bibliographic details and date. * Kalia M, Kukol, A, Structure and dynamics of the kinase IKK-β – a key regulator of the NF-kappa B transcription factor, ''J Struct Biol'', '''2011''', doi:10.1016/j.jsb.2011.07.012 * Kukol A, Consensus virtual screening approaches to predict protein ligands, ''Eur J Med Chem'', '''2011''', doi:10.1016/j.ejmech.2011.05.026 * Hardcastle MJ, Croston JH, Modelling TeV gamma-ray emission from the kiloparsec-scale jets of Centaurus A and M87, accepted by ''MNRAS'', '''2011''' Mar * Goodger JL, Croston JH, Hardcastle MJ, The influence of radio-loud AGN on groups of galaxies: Paper I - the X-ray luminosity-temperature relationship, submitted to ''MNRAS'', May 2011. * de Sousa G, Maex R, Adams R, Davey N, Steuber V, Optimization of neuronal morphologies for pattern recognition, published at ''BMC Neuroscience'' '''2010''' * Karen Safaryan, Reinoud Maex, Rod Adams, Neil Davey, and Volker Steuber The Effect of Non-Specific LTD on Pattern Recognition in Cerebellar Purkinje Cells, published at ''BMC Neuroscience'' '''11''', P92 d1dcd0876e8c8a5404c811e2352a56e024daff67 238 237 2011-07-29T08:35:08Z Akukol 3 wikitext text/x-wiki Please add details of any papers written using the cluster here. This allows us to keep track of the productivity of the cluster. You can use any format you like so long as you identify the authors, the title (this helps us keep track of what different areas of research are being carried out with the cluster), and an indication of the bibliographic details and date. * Kalia M, Kukol A, Structure and dynamics of the kinase IKK-β – a key regulator of the NF-kappa B transcription factor, ''J Struct Biol'', '''2011''', doi:10.1016/j.jsb.2011.07.012 * Kukol A, Consensus virtual screening approaches to predict protein ligands, ''Eur J Med Chem'', '''2011''', doi:10.1016/j.ejmech.2011.05.026 * Hardcastle MJ, Croston JH, Modelling TeV gamma-ray emission from the kiloparsec-scale jets of Centaurus A and M87, accepted by ''MNRAS'', '''2011''' Mar * Goodger JL, Croston JH, Hardcastle MJ, The influence of radio-loud AGN on groups of galaxies: Paper I - the X-ray luminosity-temperature relationship, submitted to ''MNRAS'', May 2011. * de Sousa G, Maex R, Adams R, Davey N, Steuber V, Optimization of neuronal morphologies for pattern recognition, published at ''BMC Neuroscience'' '''2010''' * Karen Safaryan, Reinoud Maex, Rod Adams, Neil Davey, and Volker Steuber The Effect of Non-Specific LTD on Pattern Recognition in Cerebellar Purkinje Cells, published at ''BMC Neuroscience'' '''11''', P92 43ec64f6e55489860a0f45f467b48bf67cbd8c59 249 238 2011-10-12T13:42:56Z Mjh 2 wikitext text/x-wiki Please add details of any papers written using the cluster here. 
This allows us to keep track of the productivity of the cluster. You can use any format you like so long as you identify the authors, the title (this helps us keep track of what different areas of research are being carried out with the cluster), and an indication of the bibliographic details and date. * Kalia M, Kukol A, Structure and dynamics of the kinase IKK-β – a key regulator of the NF-kappa B transcription factor, ''J Struct Biol'', '''2011''', doi:10.1016/j.jsb.2011.07.012 * Kukol A, Consensus virtual screening approaches to predict protein ligands, ''Eur J Med Chem'', '''2011''', doi:10.1016/j.ejmech.2011.05.026 * Hardcastle MJ, Croston JH, Modelling TeV gamma-ray emission from the kiloparsec-scale jets of Centaurus A and M87, '''2011''', MNRAS 415 433 * Goodger JL, Croston JH, Hardcastle MJ, The influence of radio-loud AGN on groups of galaxies: Paper I - the X-ray luminosity-temperature relationship, submitted to ''MNRAS'', May 2011. * de Sousa G, Maex R, Adams R, Davey N, Steuber V, Optimization of neuronal morphologies for pattern recognition, published at ''BMC Neuroscience'' '''2010''' * Karen Safaryan, Reinoud Maex, Rod Adams, Neil Davey, and Volker Steuber The Effect of Non-Specific LTD on Pattern Recognition in Cerebellar Purkinje Cells, published at ''BMC Neuroscience'' '''11''', P92 41941558c32d756aa57368a2fea03906b96a4828 Main Page 0 1 233 191 2011-07-08T08:49:44Z Mjh 2 wikitext text/x-wiki == Welcome to the cluster documentation wiki == This wiki is the location for documentation for the STRI cluster. If you are a cluster user, feel free to register for an account so that you can describe any method you use to do a particular thing on the cluster. Some local documentation on [[MediaWiki]] is available. == Known problems == * [[Known problems]] == Cluster basics == * [[Accounts]] * [[Policies]] * [[Access]] * [[Architecture]] * [[Networking]] * [[Storage]] * [[Administrators]]' contact details == Using the cluster == * [[Jobs]] * [[Queues]] * [[SMP machines]] * [[Sandbox]] * [[Tesla]] * [[Modules]] * [[MPI]] * [[Parallelization|How to parallelize your job]] * [[Compilers]] * [[Software]] * [[Mail]] * [[Web server]] == Troubleshooting == * [[Why doesn't my job run?]] == Publications == * [[Acknowledgements]] * [[Cluster bibliography]] b10a5ab89e22cce2e2bb80bd2b8ac02afec47b07 257 233 2012-01-07T09:35:06Z Mjh 2 wikitext text/x-wiki == Welcome to the cluster documentation wiki == This wiki is the location for documentation for the STRI cluster. If you are a cluster user, feel free to register for an account so that you can describe any method you use to do a particular thing on the cluster. Some local documentation on [[MediaWiki]] is available. == Known problems == * [[Known problems]] == Cluster basics == * [[Accounts]] * [[Policies]] * [[Access]] * [[Architecture]] * [[Networking]] * [[Storage]]; [[quota]] system * [[Administrators]]' contact details == Using the cluster == * [[Jobs]] * [[Queues]] * [[SMP machines]] * [[Sandbox]] * [[Tesla]] * [[Modules]] * [[MPI]] * [[Parallelization|How to parallelize your job]] * [[Compilers]] * [[Software]] * [[Mail]] * [[Web server]] == Troubleshooting == * [[Why doesn't my job run?]] == Publications == * [[Acknowledgements]] * [[Cluster bibliography]] 378c8f2c083a95fe35c77ff2778916b1ad7750ee 262 257 2012-02-15T15:52:27Z Mjh 2 /* Cluster basics */ wikitext text/x-wiki == Welcome to the cluster documentation wiki == This wiki is the location for documentation for the STRI cluster. 
If you are a cluster user, feel free to register for an account so that you can describe any method you use to do a particular thing on the cluster. Some local documentation on [[MediaWiki]] is available. == Known problems == * [[Known problems]] == Cluster basics == * [[Accounts]] * [[Policies]] * [[Fair share]] * [[Access]] * [[Architecture]] * [[Networking]] * [[Storage]]; [[quota]] system * [[Administrators]]' contact details == Using the cluster == * [[Jobs]] * [[Queues]] * [[SMP machines]] * [[Sandbox]] * [[Tesla]] * [[Modules]] * [[MPI]] * [[Parallelization|How to parallelize your job]] * [[Compilers]] * [[Software]] * [[Mail]] * [[Web server]] == Troubleshooting == * [[Why doesn't my job run?]] == Publications == * [[Acknowledgements]] * [[Cluster bibliography]] f7e7f2112da99fc812ef64eb982327d8164a3db6 Why doesn't my job run? 0 37 234 2011-07-08T09:32:42Z Mjh 2 Created page with 'If you submit a [[jobs|job]] and it stays in the queue, it may not be clear why nothing is happening. Obviously one possibility is that there is a problem with the cluster: but i…' wikitext text/x-wiki If you submit a [[jobs|job]] and it stays in the queue, it may not be clear why nothing is happening. Obviously one possibility is that there is a problem with the cluster: but it is also possible that things are going on that you are not seeing. To ask the scheduler what is going on with a job you can use the command <tt>/usr/local/maui/bin/checkjob</tt>. A typical <tt>checkjob</tt> run might look like this (note the <tt>-v</tt> option): <pre> /usr/local/maui/bin/checkjob -v 123456 checking job 123456 (RM job '123456.stri-cluster.herts.ac.uk') State: Idle Creds: user:fred group:fred class:main qos:DEFAULT WallTime: 00:00:00 of 7:00:00:00 SubmitTime: Fri Jul 8 09:04:48 (Time Queued Total: 00:38:52 Eligible: 00:38:52) Total Tasks: 24 Req[0] TaskCount: 24 Partition: ALL Network: [NONE] Memory >= 1024M Disk >= 0 Swap >= 0 Opsys: [NONE] Arch: [NONE] Features: [main] Exec: '' ExecSize: 0 ImageSize: 0 Dedicated Resources Per Task: PROCS: 1 MEM: 1024M NodeAccess: SHARED TasksPerNode: 8 NodeCount: 3 IWD: [NONE] Executable: [NONE] Bypass: 63 StartCount: 0 PartitionMask: [ALL] Flags: RESTARTABLE PE: 24.00 StartPriority: 2513 job cannot run in partition DEFAULT (idle procs do not meet requirements : 0 of 24 procs found) idle procs: 732 feasible procs: 0 Rejection Reasons: [Features : 58][CPU : 22][State : 18][ReserveTime : 8] Detailed Node Availability Information: node001 rejected : ReserveTime node002 rejected : ReserveTime node003 rejected : ReserveTime node004 rejected : State node005 rejected : ReserveTime node006 rejected : ReserveTime node007 rejected : ReserveTime node008 rejected : ReserveTime node009 rejected : ReserveTime node010 rejected : CPU node011 rejected : CPU node012 rejected : CPU node013 rejected : State node014 rejected : CPU node015 rejected : CPU node016 rejected : CPU node017 rejected : State node018 rejected : State node019 rejected : State node020 rejected : State node021 rejected : State node022 rejected : State node023 rejected : State node024 rejected : State node025 rejected : State node026 rejected : State node027 rejected : State node028 rejected : State node029 rejected : State node030 rejected : State node031 rejected : State node032 rejected : CPU node033 rejected : CPU node034 rejected : CPU node035 rejected : CPU node036 rejected : CPU node037 rejected : CPU node038 rejected : CPU node039 rejected : CPU node040 rejected : CPU node041 rejected : State node042 rejected : CPU node043 
rejected : CPU node044 rejected : CPU node045 rejected : CPU node046 rejected : CPU node047 rejected : CPU node048 rejected : CPU node049 rejected : Features node050 rejected : Features node051 rejected : Features node052 rejected : Features node053 rejected : Features node054 rejected : Features node055 rejected : Features node056 rejected : Features node057 rejected : Features node058 rejected : Features node059 rejected : Features node060 rejected : Features node061 rejected : Features node062 rejected : Features node063 rejected : Features node064 rejected : Features node065 rejected : Features node066 rejected : Features node067 rejected : Features node068 rejected : Features node069 rejected : Features node070 rejected : Features node071 rejected : Features node072 rejected : Features node073 rejected : Features node074 rejected : Features node075 rejected : Features node076 rejected : Features node077 rejected : Features node078 rejected : Features node079 rejected : Features node080 rejected : Features sandbox1 rejected : Features sandbox2 rejected : Features sandbox3 rejected : Features sandbox4 rejected : Features sandbox5 rejected : Features sandbox6 rejected : Features sandbox7 rejected : Features sandbox8 rejected : Features sandbox9 rejected : Features sandbox10 rejected : Features node081 rejected : Features node082 rejected : Features node083 rejected : Features node084 rejected : Features node085 rejected : Features node086 rejected : Features node087 rejected : Features node088 rejected : Features node089 rejected : Features node090 rejected : Features node091 rejected : Features node092 rejected : Features node093 rejected : Features node094 rejected : Features node095 rejected : Features node096 rejected : Features job cannot run in partition SMP (insufficient idle procs available: 0 < 24) </pre> How do you interpret all this output? First of all, if your output does not look anything like this, for example if you see an error message about not being able to contact a server, then the scheduler is not running or there is some other problem. In that case, contact one of the [[administrators]]. Assuming your output looks like the above, first of all you should check that the details of your job agree with what you think you submitted. Check <tt>NodeCount</tt> and <tt>TasksPerNode</tt> and the <tt>WallTime</tt> request. Now go down to the reason why <tt>job cannot run in partition DEFAULT</tt>. Normally, this will be as above: this is telling you that there are not enough available CPUs for your job. But suppose you think the cluster is close to empty. Why is this? Look down the list of individual nodes to see why each one is rejecting your job. The example above shows the common reasons: * Features: this means that you have asked for a feature that the node in question doesn't have. For example, jobs submitted to the main cluster will never run on the [[sandbox]] machines. It is normal for many nodes to reject your job for this reason. * State: usually this means that the node in question is marked busy by the system, which means that it is running as many jobs as possible. In this example, there are a number of busy nodes. It can also mean that nodes are down, i.e. that there is a real problem. If many nodes reject your job because of 'State' but you think they cannot be busy, check <tt>pbsnodes -a</tt> to see if they are 'down' and report a problem if so. * CPU: the node is not busy, but it has less available CPU than you have requested. 
This is most likely because it is running one or more jobs. In the example above, the user has asked for 3 nodes with 8 CPUs each: the nodes rejecting the job because of CPU do not have 8 free CPUs. (If all nodes are rejecting your job for this reason, you should check whether you are asking for an impossible configuration. Jobs submitted to the main cluster asking for more than 8 CPUs will sit there forever waiting for a machine with a suitable number of cores to become available. Does your job really need 3 machines with 8 CPUs, or would it work with 24 CPUs distributed over the cluster? * ReserveTime: your job asks to run on the node at a time when it is reserved for another purpose. This could be due to another job ahead of you in the queue, or it could be a system reservation. If you see this message but there are no jobs ahead of you in the queue, check your e-mail for a recent e-mail from the adminstrators regarding scheduled downtime. If you are seeing all of these and you can convince yourself that the individual nodes are rejecting your jobs for valid reasons, then the only thing you can do is wait for the conditions that are blocking your job to clear. Only if you can't convince yourself of this should you contact the administrators. 0a4625ad8d20bbd43bcdbd5e461e4be5ce646794 260 234 2012-01-13T13:44:57Z Mjh 2 wikitext text/x-wiki If you submit a [[jobs|job]] and it stays in the queue, it may not be clear why nothing is happening. Obviously one possibility is that there is a problem with the cluster: but it is also possible that things are going on that you are not seeing. To ask the scheduler what is going on with a job you can use the command <tt>/usr/local/maui/bin/checkjob</tt>. A typical <tt>checkjob</tt> run might look like this (note the <tt>-v</tt> option): <pre> /usr/local/maui/bin/checkjob -v 123456 checking job 123456 (RM job '123456.stri-cluster.herts.ac.uk') State: Idle Creds: user:fred group:fred class:main qos:DEFAULT WallTime: 00:00:00 of 7:00:00:00 SubmitTime: Fri Jul 8 09:04:48 (Time Queued Total: 00:38:52 Eligible: 00:38:52) Total Tasks: 24 Req[0] TaskCount: 24 Partition: ALL Network: [NONE] Memory >= 1024M Disk >= 0 Swap >= 0 Opsys: [NONE] Arch: [NONE] Features: [main] Exec: '' ExecSize: 0 ImageSize: 0 Dedicated Resources Per Task: PROCS: 1 MEM: 1024M NodeAccess: SHARED TasksPerNode: 8 NodeCount: 3 IWD: [NONE] Executable: [NONE] Bypass: 63 StartCount: 0 PartitionMask: [ALL] Flags: RESTARTABLE PE: 24.00 StartPriority: 2513 job cannot run in partition DEFAULT (idle procs do not meet requirements : 0 of 24 procs found) idle procs: 732 feasible procs: 0 Rejection Reasons: [Features : 58][CPU : 22][State : 18][ReserveTime : 8] Detailed Node Availability Information: node001 rejected : ReserveTime node002 rejected : ReserveTime node003 rejected : ReserveTime node004 rejected : State node005 rejected : ReserveTime node006 rejected : ReserveTime node007 rejected : ReserveTime node008 rejected : ReserveTime node009 rejected : ReserveTime node010 rejected : CPU node011 rejected : CPU node012 rejected : CPU node013 rejected : State node014 rejected : CPU node015 rejected : CPU node016 rejected : CPU node017 rejected : State node018 rejected : State node019 rejected : State node020 rejected : State node021 rejected : State node022 rejected : State node023 rejected : State node024 rejected : State node025 rejected : State node026 rejected : State node027 rejected : State node028 rejected : State node029 rejected : State node030 rejected : State node031 rejected : State 
node032 rejected : CPU node033 rejected : CPU node034 rejected : CPU node035 rejected : CPU node036 rejected : CPU node037 rejected : CPU node038 rejected : CPU node039 rejected : CPU node040 rejected : CPU node041 rejected : State node042 rejected : CPU node043 rejected : CPU node044 rejected : CPU node045 rejected : CPU node046 rejected : CPU node047 rejected : CPU node048 rejected : CPU node049 rejected : Features node050 rejected : Features node051 rejected : Features node052 rejected : Features node053 rejected : Features node054 rejected : Features node055 rejected : Features node056 rejected : Features node057 rejected : Features node058 rejected : Features node059 rejected : Features node060 rejected : Features node061 rejected : Features node062 rejected : Features node063 rejected : Features node064 rejected : Features node065 rejected : Features node066 rejected : Features node067 rejected : Features node068 rejected : Features node069 rejected : Features node070 rejected : Features node071 rejected : Features node072 rejected : Features node073 rejected : Features node074 rejected : Features node075 rejected : Features node076 rejected : Features node077 rejected : Features node078 rejected : Features node079 rejected : Features node080 rejected : Features sandbox1 rejected : Features sandbox2 rejected : Features sandbox3 rejected : Features sandbox4 rejected : Features sandbox5 rejected : Features sandbox6 rejected : Features sandbox7 rejected : Features sandbox8 rejected : Features sandbox9 rejected : Features sandbox10 rejected : Features node081 rejected : Features node082 rejected : Features node083 rejected : Features node084 rejected : Features node085 rejected : Features node086 rejected : Features node087 rejected : Features node088 rejected : Features node089 rejected : Features node090 rejected : Features node091 rejected : Features node092 rejected : Features node093 rejected : Features node094 rejected : Features node095 rejected : Features node096 rejected : Features job cannot run in partition SMP (insufficient idle procs available: 0 < 24) </pre> How do you interpret all this output? First of all, if your output does not look anything like this, for example if you see an error message about not being able to contact a server, then the scheduler is not running or there is some other problem. In that case, contact one of the [[administrators]]. Assuming your output looks like the above, first of all you should check that the details of your job agree with what you think you submitted. Check <tt>NodeCount</tt> and <tt>TasksPerNode</tt> and the <tt>WallTime</tt> request. You may also want to check the output of <tt>qstat -f <jobid></tt>. Now go down to the reason why <tt>job cannot run in partition DEFAULT</tt>. Normally, this will be as above: this is telling you that there are not enough available CPUs for your job. But suppose you think the cluster is close to empty. Why is this? Look down the list of individual nodes to see why each one is rejecting your job. The example above shows the common reasons: * Features: this means that you have asked for a feature that the node in question doesn't have. For example, jobs submitted to the main cluster will never run on the [[sandbox]] machines. It is normal for many nodes to reject your job for this reason. * State: usually this means that the node in question is marked busy by the system, which means that it is running as many jobs as possible. In this example, there are a number of busy nodes. 
It can also mean that nodes are down, i.e. that there is a real problem. If many nodes reject your job because of 'State' but you think they cannot be busy, check <tt>pbsnodes -a</tt> to see if they are 'down' and report a problem if so. * CPU: the node is not busy, but it has less available CPU than you have requested. This is most likely because it is running one or more jobs. In the example above, the user has asked for 3 nodes with 8 CPUs each: the nodes rejecting the job because of CPU do not have 8 free CPUs. (If all nodes are rejecting your job for this reason, you should check whether you are asking for an impossible configuration. Jobs submitted to the main cluster asking for more than 8 CPUs will sit there forever waiting for a machine with a suitable number of cores to become available. Does your job really need 3 machines with 8 CPUs, or would it work with 24 CPUs distributed over the cluster?) * ReserveTime: your job asks to run on the node at a time when it is reserved for another purpose. This could be due to another job ahead of you in the queue, or it could be a system reservation. If you see this message but there are no jobs ahead of you in the queue, check your e-mail for a recent message from the administrators regarding scheduled downtime. If you are seeing all of these and you can convince yourself that the individual nodes are rejecting your jobs for valid reasons, then the only thing you can do is wait for the conditions that are blocking your job to clear. Only if you can't convince yourself of this should you contact the administrators. dacbf4610028be6089ab4d51e406f09764c2fb30 Mail 0 18 247 70 2011-09-12T07:25:33Z Mjh 2 wikitext text/x-wiki Various systems on the cluster will want to send you e-mail. By default this will be delivered to a local mailbox: you will need to read it with a command-line mail tool like <tt>mutt</tt> on the head node. You are advised to set up a <tt>.forward</tt> file which will send it to your normal inbox. To do this, simply create the file containing one line of text, the e-mail address to forward to: <pre> cat <<END >.forward f.bloggs@herts.ac.uk END </pre> Please don't allow your inbox on the cluster to fill up with large messages. If you do not follow these instructions, the administrators reserve the right to delete your mailbox and/or to set up a <tt>.forward</tt> file for you themselves. 309d1a91aff602965fbcd4c748ce0efb5c69945a 258 247 2012-01-07T09:35:58Z Mjh 2 wikitext text/x-wiki Various systems on the cluster will want to send you e-mail. By default this will be delivered to a local mailbox: you will need to read it with a command-line mail tool like <tt>mutt</tt> on the head node. You are advised to set up a <tt>.forward</tt> file which will send it to your normal inbox. To do this, simply create the file containing one line of text, the e-mail address to forward to: <pre> cd cat <<END >.forward f.bloggs@herts.ac.uk END </pre> Please don't allow your inbox on the cluster to fill up with large messages. If you do not follow these instructions, the administrators reserve the right to delete your mailbox and/or to set up a <tt>.forward</tt> file for you themselves. 8d4c30267ad498f7fd6481c3d4bdda5bd2899cb5 259 258 2012-01-07T09:36:16Z Mjh 2 wikitext text/x-wiki Various systems on the cluster will want to send you e-mail. By default this will be delivered to a local mailbox: you will need to read it with a command-line mail tool like <tt>mutt</tt> on the head node.
You are advised to set up a <tt>.forward</tt> file in your home directory which will send mail to your normal inbox. To do this, simply create the file containing one line of text, the e-mail address to forward to: <pre> cd cat <<END >.forward f.bloggs@herts.ac.uk END </pre> Please don't allow your inbox on the cluster to fill up with large messages. If you do not follow these instructions, the administrators reserve the right to delete your mailbox and/or to set up a <tt>.forward</tt> file for you themselves. 8de3df4c1bd76ce2fa835b858d4128e930d2849d Access 0 5 251 15 2011-10-30T09:23:32Z Mjh 2 wikitext text/x-wiki == Access == The [[architecture|head node]] of the cluster is accessible from within the university by ssh to stri-cluster.herts.ac.uk, once you have an [[accounts|account]] set up. If you are working from a Unix desktop, you should be able to type <tt>ssh username@stri-cluster.herts.ac.uk</tt>. If you are using Windows, you will need a Windows ssh client: we recommend PuTTY[http://www.chiark.greenend.org.uk/~sgtatham/putty/]. Individual compute nodes must be accessed via [[interactive jobs]] run on the head node: see also the [[policies|policy]] relating to this. dedfe8080401399f50034fe70012b30ad52c6d9e Storage 0 8 253 153 2012-01-06T12:55:46Z Mjh 2 wikitext text/x-wiki The '''storage''' available on the cluster is set up as follows: * 1 Tb of user home directories, mounted as /home * 65 Tb of scratch available to all users, mounted as /stri-data * 40 Tb of scratch for CAIR users only, mounted as /cair-data There is also an area for software to be shared over the network, mounted as /soft. These paths should work for both the head node and the compute nodes. The /home disc is for files that are important and require long-term storage (source code, qsub scripts, etc). It should not be used for general data or for large quantities of output from jobs, for example. Large files should be stored on the relevant data disk (if you want to work in /home, use a symbolic link). A [[quota]] system is in place on /home. Please use the scratch space responsibly: when you no longer need it, delete the contents or transfer them elsewhere so that others can use the space. Large astronomical datasets will be being processed on the cluster. (See also [[policies]].) The [[SMP machines]] have their own local 10-Tb scratch discs: see the relevant page for more information. No data area on the cluster is currently backed up. You must take responsibility for your own backups. In the longer term we expect to be able to make regular backups of /home and possibly some of /stri-data . 4c3e076ea710b08679b64684e37de1642f5d9b96 Quota 0 38 254 2012-01-06T12:57:06Z Mjh 2 Created page with 'Use of space on /home is subject to a quota system. You have a fixed amount of space that you are allowed to use: if you exceed this, you will no longer be able to create files o…' wikitext text/x-wiki Use of space on /home is subject to a quota system. You have a fixed amount of space that you are allowed to use: if you exceed this, you will no longer be able to create files on /home. The current default quota for all users is 20 Gb. c2b78b679eff0883eb653c0e7a2d46465f133440 255 254 2012-01-07T09:33:34Z Mjh 2 wikitext text/x-wiki Use of space on <tt>/home</tt> is subject to a quota system. You have a fixed amount of space that you are allowed to use: if you exceed this, you will no longer be able to create files on <tt>/home</tt>. The current default quota for all users is 20 Gb. 
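To see how close you are to the limit, something like the following should work from the head node (a minimal sketch: <tt>du</tt> is always available, while <tt>quota</tt> assumes the standard Linux quota tools are enabled on /home):
<pre>
# total size of everything under your home directory
du -sh ~
# per-filesystem quota report, if the standard quota tools are in use
quota -s
</pre>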
When you reach 19 Gb, you will be warned and given a period (1 week) in which your usage should be reduced below 19 Gb; if you fail to reduce usage in this period, or if your usage reaches 20 Gb, new file creation will be blocked. The quota is ''not'' an expected reasonable use for a cluster user. You should try to keep your use of <tt>/home</tt> as low as possible, and certainly lower than 10 Gb. If you believe you have a specific need to exceed this quota, please contact the cluster [[administrators]]. 7dc04054b1fa3505e8b6232bd7a1d0505284664b 256 255 2012-01-07T09:34:25Z Mjh 2 wikitext text/x-wiki Use of space on <tt>/home</tt> is subject to a quota system. You have a fixed amount of space that you are allowed to use: if you exceed this, you will no longer be able to create files on <tt>/home</tt>. The current default quota for all users is 20 Gb. When you reach 19 Gb, you will be warned and given a period (1 week) in which your usage should be reduced below 19 Gb; if you fail to reduce usage in this period, or if your usage reaches 20 Gb, new file creation will be blocked. The quota is ''not'' an expected reasonable use for a cluster user. You should try to keep your use of <tt>/home</tt> as low as possible, and certainly lower than 10 Gb. If you believe you have a specific need to exceed this quota, please contact the cluster [[administrators]]. There is no quota on the various data areas and these are the locations where it is appropriate to store large volumes of data. 03786c0ff9a15d66dff2c386673aeeac01310ef8 Fair share 0 39 263 2012-02-15T16:04:12Z Mjh 2 Created page with 'There are various systems in place to try to make sure that everyone gets a fair share of the cluster (particularly the main cluster) and that a mixture of long and short, large …' wikitext text/x-wiki There are various systems in place to try to make sure that everyone gets a fair share of the cluster (particularly the main cluster) and that a mixture of long and short, large and small jobs can be run. Jobs are run based on a priority assigned by the scheduler. The priority takes account of a number of factors: * Jobs that involve more CPUs intrinsically have higher priority. This is to stop single-CPU work pushing out large jobs. * Jobs that have been in the queue for longer have higher priority, up to a maximum of 64 idle jobs per user (after that, the jobs will wait in the queue but will not accrue priority) * Jobs from users who have not recently significantly used the cluster have priority over those from users who have. (This is called the fair-share mechanism: the 'fair share amount' is a weighted integral of usage over the last week, weighted towards more recent use.) In addition, * no user can have more than 320 processors' worth of jobs running at once. This is intended to stop a single user monopolizing the cluster * no user can have a processor-time product that exceeds 1 week x 128 nodes running at any given time. This is intended to stop large long jobs blocking shorter jobs. These policies apply to all queues but are tailored to the main queue where the competition is highest. If they cause you problems, please contact the [[administrators]]. We are happy to review policies to try to get the fairest result for everyone. 39a33db9485b808cbd49272ddf4bde7438cbedea Policies 0 4 264 261 2012-02-15T16:05:04Z Mjh 2 wikitext text/x-wiki The cluster is by design a shared resource. In using it you must be considerate of other users. 
Some detailed guidelines are as follows: * The [[architecture|head node]] must '''never''' be used for computation of any kind on a larger scale than a few minutes' testing or processing of results from the compute nodes. * The only permitted method of using the compute nodes is by way of the [[jobs|batch queuing system]]. You '''must not''' log in to the compute nodes directly using ssh. If you need an interactive session on a compute node, e.g. for data processing, use the [[interactive jobs]] facility. * When using the batch queuing system you '''must''' honour the allocations of nodes that your job is given. * Please use the [[storage]] with consideration to others. We reserve the right in extreme situations to tidy up after you. * There is a [[fair share]] policy which has been tuned to a certain extent and is operated automatically by the scheduler. Essentially this means that you may find that other people's jobs overtake yours in the queue in an attempt to distribute resources fairly. The intention is that this system will operate only on a timescale of days, though. We do not presently guarantee fairness on timescale of minutes to hours, because to do that we would have to be able to pre-empt jobs that are already running. Please be aware of these limitations of the fair-share policy and work within them. * If you are one of the group of people with exclusive or privileged use of a group of nodes, you must use those nodes in preference to the nodes of the main cluster if possible. You should not run jobs in the 'main' queue unless no nodes in your reserved group are available. This applies to members of CAIR and CAR. 011c6a44c426048bbaf7b0f910051871d4c6eda9 292 264 2013-01-29T12:43:20Z Mjh 2 wikitext text/x-wiki The cluster is by design a shared resource. In using it you must be considerate of other users. Some detailed guidelines are as follows: * Accounts are for use by the named user only. You must not allow anyone else to use your account. * The [[architecture|head node]] must '''never''' be used for computation of any kind on a larger scale than a few minutes' testing or processing of results from the compute nodes. * The only permitted method of using the compute nodes is by way of the [[jobs|batch queuing system]]. You '''must not''' log in to the compute nodes directly using ssh. If you need an interactive session on a compute node, e.g. for data processing, use the [[interactive jobs]] facility. * When using the batch queuing system you '''must''' honour the allocations of nodes that your job is given. * Please use the [[storage]] with consideration to others. We reserve the right in extreme situations to tidy up after you. * There is a [[fair share]] policy which has been tuned to a certain extent and is operated automatically by the scheduler. Essentially this means that you may find that other people's jobs overtake yours in the queue in an attempt to distribute resources fairly. The intention is that this system will operate only on a timescale of days, though. We do not presently guarantee fairness on timescale of minutes to hours, because to do that we would have to be able to pre-empt jobs that are already running. Please be aware of these limitations of the fair-share policy and work within them. * If you are one of the group of people with exclusive or privileged use of a group of nodes, you must use those nodes in preference to the nodes of the main cluster if possible. 
You should not run jobs in the 'main' queue unless no nodes in your reserved group are available. This applies to members of CAIR and CAR. bcb34eddf4440be2b962aaf3f20cdb699ed7e98e Software 0 17 265 197 2012-05-04T15:04:39Z Mjh 2 wikitext text/x-wiki This page documents locations of software. Detailed local documentation of software should go in a page specific to that software. * <u>[[Gromacs]]</u>: 4.0.7 installed in <tt>/soft/gromacs</tt> * Autodock <u>[[Vina]]</u>: 1.1.1 installed in <tt>/soft/autodock_vina_1_1_1_linux_x86</tt> * <u>[[Autodock]]</u>: 4.2 installed in <tt>/soft/autodock</tt> * <u>[[iGemDock]]</u>: 2.1 installed in <tt>/soft/iGEMDOCKv2.1-centos/</tt> * <u>[[AIPS]]</u>: 31DEC10 installed in /soft/aips * <u>[[CASA]]</u>: 3.1.0 installed in /soft/casapy-31.0.13530-002-64b * <u>[[IDL]]</u>: in /soft/idl/idl/bin * [[Matlab]]: in <tt>/soft/matlab/R2010a/bin<tt> 28b5a029cec577fa8940d0fa3003423c84430061 266 265 2012-05-04T15:04:55Z Mjh 2 wikitext text/x-wiki This page documents locations of software. Detailed local documentation of software should go in a page specific to that software. * <u>[[Gromacs]]</u>: 4.0.7 installed in <tt>/soft/gromacs</tt> * Autodock <u>[[Vina]]</u>: 1.1.1 installed in <tt>/soft/autodock_vina_1_1_1_linux_x86</tt> * <u>[[Autodock]]</u>: 4.2 installed in <tt>/soft/autodock</tt> * <u>[[iGemDock]]</u>: 2.1 installed in <tt>/soft/iGEMDOCKv2.1-centos/</tt> * <u>[[AIPS]]</u>: 31DEC10 installed in /soft/aips * <u>[[CASA]]</u>: 3.1.0 installed in /soft/casapy-31.0.13530-002-64b * <u>[[IDL]]</u>: in /soft/idl/idl/bin * <u>[[Matlab]]</u>: in <tt>/soft/matlab/R2010a/bin<tt> f77f431e3f07d4eff4cb2e8949abc7b689aad11e 320 266 2013-05-13T10:23:18Z Mjh 2 wikitext text/x-wiki This page documents locations of software. Detailed local documentation of software should go in a page specific to that software. * <u>[[Gromacs]]</u>: 4.0.7 installed in <tt>/soft/gromacs</tt> * Autodock <u>[[Vina]]</u>: 1.1.1 installed in <tt>/soft/autodock_vina_1_1_1_linux_x86</tt> * <u>[[Autodock]]</u>: 4.2 installed in <tt>/soft/autodock</tt> * <u>[[iGemDock]]</u>: 2.1 installed in <tt>/soft/iGEMDOCKv2.1-centos/</tt> * <u>[[AIPS]]</u>: 31DEC10 installed in <tt>/soft/aips</tt> * <u>[[CASA]]</u>: 3.1.0 installed in <tt>/soft/casapy-31.0.13530-002-64b</tt> * <u>[[IDL]]</u>: in <tt>/soft/idl/idl/bin</tt> * <u>[[Matlab]]</u>: in <tt>/soft/matlab/R2010a/bin</tt> 9a0b79f034d89e1751b4b082658459f22af63a2b AIPS 0 27 267 175 2012-05-04T15:07:12Z Mjh 2 wikitext text/x-wiki AIPS is software for radio astronomy data reduction. It is installed on the cluster in /soft/aips . To use aips you will need to be in the aipsuser group. From the head node, use an [[interactive jobs|interactive job]] to get to the machine you want to use. AIPS is set up on some of the nodes, but most people use the [[SMP machines]]. Be sure to use the -X option to get X11 forwarding. Then do <tt>/soft/aips/START_AIPS tv=local</tt>. Disc 1 will be a local disc. Optionally, do <tt>/soft/aips/START_AIPS tv=local da=stri-cluster</tt> to get access to the cluster data area -- but you are recommended not to try to use this for data reduction. 9a5816cb5a22a1b79b7b8bfe8da3ff514d2b6e38 CASA 0 28 268 176 2012-05-04T15:08:24Z Mjh 2 wikitext text/x-wiki CASA is software for radio astronomy data reduction. It is installed on the cluster at casapy-stable-34.0.17353-001-64b . To use casa, do <tt>setenv PATH /soft/casapy-stable-34.0.17353-001-64b:$PATH</tt> and then run it with <tt>casapy</tt>. 
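Put together (a sketch only: the <tt>setenv</tt> form is for csh/tcsh as in the command above, and the <tt>export</tt> form is the bash equivalent):
<pre>
# csh/tcsh
setenv PATH /soft/casapy-stable-34.0.17353-001-64b:$PATH
casapy

# bash
export PATH=/soft/casapy-stable-34.0.17353-001-64b:$PATH
casapy
</pre>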
You should not run CASA on the head node: either run it through the batch job system or use an [[interactive jobs|interactive job]]. d5ef8ed73ef8185c0ed5452ab3eeccec18d84c3d Interactive jobs 0 35 269 215 2012-05-04T15:10:25Z Mjh 2 wikitext text/x-wiki Interactive jobs are jobs which give you an interactive session on one of the compute nodes. Importantly, accessing the compute nodes this way means that the job control system guarantees the resources that you have asked for. If you simply log into a compute node with ssh (which is in any case forbidden by the cluster access [[policies]]), another user's job may compete with what you are trying to do. Therefore you should, unless explicitly authorized otherwise, always use the interactive job facility to run interactively on the compute nodes. == Running an interactive job == An interactive job is run using the <tt>qsub</tt> command as for normal [[jobs]]. However, the result of running an interactive job is that you are logged on to the target machine. For example, <pre> [user@stri-cluster ~]$ qsub -l walltime=00:30:00 -I -q main qsub: waiting for job 123456.stri-cluster.herts.ac.uk to start qsub: job 123456.stri-cluster.herts.ac.uk ready [user@node047 ~]$ </pre> In this example the user has requested a 30-minute session on any node in the main cluster (the default request is one processor on one node). A node is available to run the job and so the user finds herself at a shell prompt. She may now use the node interactively for 30 minutes. At the end of the 30 minutes (wall time, not CPU time!) the job will end and the session will be terminated. If the user logs out before then, the job will terminate early. Note that, just as with ordinary jobs, you must honour the allocation you are given. If you ask for one CPU on one node, you must only use that. In the interactive shell, a number of environment variables beginning with PBS (try <tt>printenv | grep PBS</tt>) are provided to tell you the job environment in which you are running in case you have forgotten. If your request for an interactive shell cannot be fulfilled (because there are insufficient nodes available in the queue that you have specified), the qsub command will wait until it can be. == Advanced topics == === Multiple CPUs === If you know you want a machine completely dedicated to you (e.g. because you plan to run multithreaded code interactively) then you must explicitly request that: e.g., <pre> qsub -l walltime=24:00:00 -l nodes=1:ppn=48 -I -q smp </pre> will reserve all 48 cores of one of the [[SMP machines]] for you for a day. === Multiple nodes === In the unlikely situation in which you want to run interactively on several nodes at once, you will only be given one shell. You can use the <tt>pbsdsh</tt> command to run the same commands on each of the processors you have been allocated, or /usr/local/bin/mpiexec to run [[MPI]] jobs. <pre> qsub -l walltime=00:01:00 -l nodes=2:ppn=2 -I -q smp qsub: waiting for job 123456.stri-cluster.herts.ac.uk to start qsub: job 123456.stri-cluster.herts.ac.uk ready [user@smp2 ~]$ pbsdsh hostname smp2 smp1 smp1 smp2 </pre> === Specific machines === It is possible to request a specific machine just as for normal non-interactive [[jobs]]: <pre> qsub -l walltime=00:01:00 -l nodes=smp2:ppn=48 -I -q smp </pre> Do this if you know that you need to use some particular feature of the machine, such as the [[SMP machines]]' scratch discs. 
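As a further illustration (a sketch, not a prescription; smp1 is just an example target and any node name mentioned on this page can be used in the same way), you can confirm which machine and allocation you have been given once the interactive session starts:
<pre>
qsub -l walltime=01:00:00 -l nodes=smp1:ppn=1 -I -q smp
# once the prompt appears on the compute node:
hostname
printenv | grep PBS
</pre>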
=== X forwarding === If you want to run jobs that use the X windows system, you may turn on X11 forwarding using the <tt>-X</tt> option. (This is like the <tt>-X</tt> option to <tt>ssh</tt>.) === Walltime requests === Please be considerate in your walltime requests. While you have an interactive job running, no other user can use the resources you have allocated. Therefore, you should not set an interactive job running for a week and walk away. Jobs that appear to be reserving resources without using them will be terminated. Obviously the best strategy is to select a walltime that roughly matches what you want and exit before the time is up. 1e8d3294e31916a19138ea956e787bb1e7f16c17 Storage 0 8 270 253 2012-05-04T15:13:06Z Mjh 2 wikitext text/x-wiki The '''storage''' available on the cluster is set up as follows: * 1 Tb of user home directories, mounted as /home * 61 Tb of scratch available to all users, mounted as /stri-data * 40 Tb of scratch for CAIR users only, mounted as /cair-data * 19 Tb of scratch for CAIR users only, mounted as /cair-data3 * 39 Tb of scratch for CAIR users only, mounted as /cair-data4 * 19 Tb of scratch for CAIR users only, mounted as /cair-data5 * 58 Tb of scratch for CAR users only, mounted as /car-data There is also an area for software to be shared over the network, mounted as /soft. These paths should work for both the head node and the compute nodes. The /home disc is for files that are important and require long-term storage (source code, qsub scripts, etc). It should not be used for general data or for large quantities of output from jobs, for example. Large files should be stored on the relevant data disk (if you want to work in /home, use a symbolic link). A [[quota]] system is in place on /home. Please use the scratch space responsibly: when you no longer need it, delete the contents or transfer them elsewhere so that others can use the space. Large datasets will be being processed on the cluster. (See also [[policies]].) The [[SMP machines]] have their own local 10-Tb scratch discs: see the relevant page for more information. No data area on the cluster is currently backed up. You must take responsibility for your own backups. In the longer term we expect to be able to make regular backups of /home and possibly some of /stri-data . a439f5eb6ce2bc83843f13597785284514a7b8aa 277 270 2012-09-18T10:21:25Z Jonnya 9 wikitext text/x-wiki The '''storage''' available on the cluster is set up as follows: * 1 Tb of user home directories, mounted as /home * 61 Tb of scratch available to all users, mounted as /stri-data * 40 Tb of scratch for CAIR users only, mounted as /cair-data * 19 Tb of scratch for CAIR users only, mounted as /cair-data3 * 39 Tb of scratch for CAIR users only, mounted as /cair-data4 * 19 Tb of scratch for CAIR users only, mounted as /cair-data5 * 58 Tb of scratch for CAR users only, mounted as /car-data * 77 Tb of scratch dor CAIR users only, mounted as /dair-storage There is also an area for software to be shared over the network, mounted as /soft. These paths should work for both the head node and the compute nodes. The /home disc is for files that are important and require long-term storage (source code, qsub scripts, etc). It should not be used for general data or for large quantities of output from jobs, for example. Large files should be stored on the relevant data disk (if you want to work in /home, use a symbolic link). A [[quota]] system is in place on /home. 
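For example (a sketch only: it assumes you are allowed to create a directory at the top level of /stri-data, the directory and link names are your choice, and CAIR/CAR users would use their own data areas instead):
<pre>
# make yourself a working area on the shared scratch disc
mkdir -p /stri-data/$USER
# link to it from your home directory so that you can work 'in' /home
# while the data itself lives on scratch and does not count against your quota
ln -s /stri-data/$USER ~/scratch
</pre>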
Please use the scratch space responsibly: when you no longer need it, delete the contents or transfer them elsewhere so that others can use the space. Large datasets will be being processed on the cluster. (See also [[policies]].) The [[SMP machines]] have their own local 10-Tb scratch discs: see the relevant page for more information. No data area on the cluster is currently backed up. You must take responsibility for your own backups. In the longer term we expect to be able to make regular backups of /home and possibly some of /stri-data . 8dd43a1eb9e34f58f182dbfd614e00dc0db97c87 278 277 2012-09-18T10:21:33Z Jonnya 9 wikitext text/x-wiki The '''storage''' available on the cluster is set up as follows: * 1 Tb of user home directories, mounted as /home * 61 Tb of scratch available to all users, mounted as /stri-data * 40 Tb of scratch for CAIR users only, mounted as /cair-data * 19 Tb of scratch for CAIR users only, mounted as /cair-data3 * 39 Tb of scratch for CAIR users only, mounted as /cair-data4 * 19 Tb of scratch for CAIR users only, mounted as /cair-data5 * 58 Tb of scratch for CAR users only, mounted as /car-data * 77 Tb of scratch for CAIR users only, mounted as /dair-storage There is also an area for software to be shared over the network, mounted as /soft. These paths should work for both the head node and the compute nodes. The /home disc is for files that are important and require long-term storage (source code, qsub scripts, etc). It should not be used for general data or for large quantities of output from jobs, for example. Large files should be stored on the relevant data disk (if you want to work in /home, use a symbolic link). A [[quota]] system is in place on /home. Please use the scratch space responsibly: when you no longer need it, delete the contents or transfer them elsewhere so that others can use the space. Large datasets will be being processed on the cluster. (See also [[policies]].) The [[SMP machines]] have their own local 10-Tb scratch discs: see the relevant page for more information. No data area on the cluster is currently backed up. You must take responsibility for your own backups. In the longer term we expect to be able to make regular backups of /home and possibly some of /stri-data . dcb9c555370a54c8aea5253a3aef63387ca5e527 281 278 2012-10-08T10:41:41Z Jonnya 9 wikitext text/x-wiki The '''storage''' available on the cluster is set up as follows: * 1 Tb of user home directories, mounted as /home * 61 Tb of scratch available to all users, mounted as /stri-data * 40 Tb of scratch for CAIR users only, mounted as /cair-data * 19 Tb of scratch for CAIR users only, mounted as /cair-data3 * 39 Tb of scratch for CAIR users only, mounted as /cair-data4 * 19 Tb of scratch for CAIR users only, mounted as /cair-data5 * 58 Tb of scratch for CAR users only, mounted as /car-data * 77 Tb of scratch for CAIR users only, mounted as /cair-storage There is also an area for software to be shared over the network, mounted as /soft. These paths should work for both the head node and the compute nodes. The /home disc is for files that are important and require long-term storage (source code, qsub scripts, etc). It should not be used for general data or for large quantities of output from jobs, for example. Large files should be stored on the relevant data disk (if you want to work in /home, use a symbolic link). A [[quota]] system is in place on /home. 
Please use the scratch space responsibly: when you no longer need it, delete the contents or transfer them elsewhere so that others can use the space. Large datasets will be being processed on the cluster. (See also [[policies]].) The [[SMP machines]] have their own local 10-Tb scratch discs: see the relevant page for more information. No data area on the cluster is currently backed up. You must take responsibility for your own backups. In the longer term we expect to be able to make regular backups of /home and possibly some of /stri-data . a6b5e3cd645d52a4298937abed7452ed0d62e7f3 Architecture 0 7 272 248 2012-07-25T07:26:01Z Mjh 2 wikitext text/x-wiki The cluster consists of * a head node, which is an 2 socket x 4-core x 2 Hyperthreads Xeon-based machine with 32 Gb RAM * 124 compute nodes (or just 'nodes'), as follows: ** 48 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 24 Gb RAM and DDR Infiniband form the Main cluster (chassis 1, 2 and 3) ** 32 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 12 Gb RAM and QDR Infiniband form part of the CAIR cluster (chassis 4 and 5) ** 12 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 Gb RAM and QDR Infiniband form the rest of the CAIR cluster (chassis 6) ** 16 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 Gb RAM and QDR Infiniband form part of the CAR cluster (chassis 7) ** 16 Xeons (X5675s) 2 socket x 6-core (no hyperthreading) with 24 Gb RAM and QDR Infiniband form the rest of the CAR cluster (chassis 8) * Two [[SMP machines]] each with 48 cores (4 sockets x 12 cores, Opteron 6174), 256 Gb RAM and QDR Infiniband (smp1 and smp2). * A [[Tesla|Tesla S2050]] CUDA unit attached to smp1. * 9 [[sandbox machines]] (lower-power machines with 1 Gb/core and no Infiniband, intended for testing and development) * 200 Tb of [[storage]] attached via Fibre Channel to the head node * A separate server providing an additional 12 Tb of storage for CAIR use. * Ethernet and infiniband switches to provide connectivity. The compute nodes are Dell blades installed in enclosures (chassis) of 16 blades each: there are 8 chassis in total. Nodes within a given chassis communicate via switches internal to the chassis; external switches connect the chassis together. See [[networking]] for more details. The image below shows the physical layout of the cluster components as at late 2011. http://stri-cluster.herts.ac.uk/cluster2.jpg b369c918022b53855a55d98fa66083a88318b24d 276 272 2012-09-18T10:20:31Z Jonnya 9 wikitext text/x-wiki The cluster consists of * a head node, which is an 2 socket x 4-core x 2 Hyperthreads Xeon-based machine with 32 Gb RAM * 124 compute nodes (or just 'nodes'), as follows: ** 48 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 24 Gb RAM and DDR Infiniband form the Main cluster (chassis 1, 2 and 3) ** 32 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 12 Gb RAM and QDR Infiniband form part of the CAIR cluster (chassis 4 and 5) ** 12 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 Gb RAM and QDR Infiniband form the rest of the CAIR cluster (chassis 6) ** 16 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 Gb RAM and QDR Infiniband form part of the CAR cluster (chassis 7) ** 16 Xeons (X5675s) 2 socket x 6-core (no hyperthreading) with 24 Gb RAM and QDR Infiniband form the rest of the CAR cluster (chassis 8) * Two [[SMP machines]] each with 48 cores (4 sockets x 12 cores, Opteron 6174), 256 Gb RAM and QDR Infiniband (smp1 and smp2). 
* A cair data processing server - [[cair-cluster]], which is a 2 socket x 8 core Xeon-based machine with 32 Gb RAM ('''CAIR use only'''). ** 77 Tb of [[storage]] attached via Fibre Channel to this server. * A [[Tesla|Tesla S2050]] CUDA unit attached to smp1. * 9 [[sandbox machines]] (lower-power machines with 1 Gb/core and no Infiniband, intended for testing and development). * 200 Tb of [[storage]] attached via Fibre Channel to the head node. * A separate server providing an additional 12 Tb of storage for CAIR use. * Ethernet and infiniband switches to provide connectivity. The compute nodes are Dell blades installed in enclosures (chassis) of 16 blades each: there are 8 chassis in total. Nodes within a given chassis communicate via switches internal to the chassis; external switches connect the chassis together. See [[networking]] for more details. The image below shows the physical layout of the cluster components as at late 2011. http://stri-cluster.herts.ac.uk/cluster2.jpg 74d85317aa7714ce2e7e818d7ff9865def222a0a 291 276 2012-12-22T07:40:26Z Mjh 2 wikitext text/x-wiki The cluster consists of * a head node, which is an 2 socket x 4-core x 2 Hyperthreads Xeon-based machine with 32 Gb RAM * 124 compute nodes (or just 'nodes'), as follows: ** 48 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 24 Gb RAM and DDR Infiniband form the Main cluster (chassis 1, 2 and 3) ** 32 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 12 Gb RAM and QDR Infiniband form part of the CAIR cluster (chassis 4 and 5) ** 12 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 Gb RAM and QDR Infiniband form the rest of the CAIR cluster (chassis 6) ** 16 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 Gb RAM and QDR Infiniband form part of the CAR cluster (chassis 7) ** 16 Xeons (X5675s) 2 socket x 6-core (no hyperthreading) with 24 Gb RAM and QDR Infiniband form the rest of the CAR cluster (chassis 8) * Two [[SMP machines]] each with 48 cores (4 sockets x 12 cores, Opteron 6174), 256 Gb RAM and QDR Infiniband (smp1 and smp2). * A cair data processing server - [[cair-cluster]], which is a 2 socket x 8 core Xeon-based machine with 32 Gb RAM ('''CAIR use only'''). ** 77 Tb of [[storage]] attached via Fibre Channel to this server. * A [[Tesla|Tesla S2050]] CUDA unit attached to smp1. * 9 [[sandbox machines]] (lower-power machines with 1 Gb/core and no Infiniband, intended for testing and development). * 200 Tb of [[storage]] attached via Fibre Channel to the head node. * A separate server providing an additional 12 Tb of storage for CAIR use. * Ethernet and infiniband switches to provide connectivity. The compute nodes are Dell blades installed in enclosures (chassis) of 16 blades each: there are 8 chassis in total. Nodes within a given chassis communicate via switches internal to the chassis; external switches connect the chassis together. See [[networking]] for more details. The image below shows the physical layout of the cluster components as at late 2011. The cluster is no longer in this configuration. http://stri-cluster.herts.ac.uk/cluster2.jpg 66bb7257abf5e272ee0c9b0bb10880b702fcbe7b Queues 0 15 273 243 2012-07-25T07:27:10Z Mjh 2 wikitext text/x-wiki There are six possible job queues available on the system: * 'main' is the default queue: this submits to the 48 nodes of the main cluster. The maximum wall time on this queue is 1 week. * 'cair_s' submits to the 46 CAIR nodes, with a maximum wall time of 6 hours. Users may have at most 2 jobs in this queue at a time. 
* 'cair_l' also submits to the CAIR nodes with no maximum wall time. This queue is restricted to CAIR users. * 'smp' submits to the two [[SMP machines]]. It is not currently possible to run MPI jobs that span the SMP machines and the main or CAIR clusters. The maximum wall time for this queue is 48 hours. * 'sandbox' submits to the [[sandbox]] machines and is for use for development and testing. The maximum wall time for this queue is 1 hour. * 'car' submits to the 32 dedicated CAR nodes. This queue is restricted to CAR users. The maximum wall time for this queue is 1 week. == Default wall times == The default wall time for the 'sandbox' queue is also the maximum, i.e. 1 hour. The default wall time limitation for the 'cair_s' queue is also the maximum, i.e. 6 hours. The default wall time for all the other queues is 24 hours. If you want a job to run for longer, you must explicitly provide a wall time estimate. == Other defaults == The default memory use per process for the main queue is 1 Gb. If you need more memory than this, see the section on [[memory]]. The default number of nodes for a job on the main queue is 1. c940bf301186b982ca9d63ab2470f6b919e8a8df 274 273 2012-07-25T07:28:29Z Mjh 2 wikitext text/x-wiki There are six possible job queues available on the system: * 'main' is the default queue: this submits to the 48 nodes of the main cluster. The maximum wall time on this queue is 1 week. * 'cair_s' submits to a CAIR node dedicated to small (few CPU) jobs. This queue is restricted to CAIR users. * 'cair_l' submits to the 46 CAIR nodes with no maximum wall time. This queue is restricted to CAIR users. * 'smp' submits to the two [[SMP machines]]. It is not currently possible to run MPI jobs that span the SMP machines and the main or CAIR clusters. The maximum wall time for this queue is 48 hours. * 'sandbox' submits to the [[sandbox]] machines and is for use for development and testing. The maximum wall time for this queue is 1 hour. * 'car' submits to the 32 dedicated CAR nodes. This queue is restricted to CAR users. The maximum wall time for this queue is 1 week. == Default wall times == The default wall time for the 'sandbox' queue is also the maximum, i.e. 1 hour. The default wall time limitation for the 'cair_s' queue is also the maximum, i.e. 6 hours. The default wall time for all the other queues is 24 hours. If you want a job to run for longer, you must explicitly provide a wall time estimate. == Other defaults == The default memory use per process for the main queue is 1 Gb. If you need more memory than this, see the section on [[memory]]. The default number of nodes for a job on the main queue is 1. 5f6933bb3074d7034c2e93277cfa4eacc31f293e 297 274 2013-02-19T12:31:41Z Jonnya 9 wikitext text/x-wiki There are six possible job queues available on the system: * 'main' is the default queue: this submits to the 48 nodes of the main cluster. The maximum wall time on this queue is 1 week. * 'cair_s' submits to a CAIR node dedicated to small (few CPU) jobs. This queue is restricted to CAIR users. * 'cair_l' submits to the 46 CAIR nodes with no maximum wall time. This queue is restricted to CAIR users. * 'smp' submits to the two [[SMP machines]]. It is not currently possible to run MPI jobs that span the SMP machines and the main or CAIR clusters. The maximum wall time for this queue is 96 hours. * 'sandbox' submits to the [[sandbox]] machines and is for use for development and testing. The maximum wall time for this queue is 1 hour. * 'car' submits to the 32 dedicated CAR nodes. 
This queue is restricted to CAR users. The maximum wall time for this queue is 1 week. == Default wall times == The default wall time for the 'sandbox' queue is also the maximum, i.e. 1 hour. The default wall time limitation for the 'cair_s' queue is also the maximum, i.e. 6 hours. The default wall time for all the other queues is 24 hours. If you want a job to run for longer, you must explicitly provide a wall time estimate. == Other defaults == The default memory use per process for the main queue is 1 Gb. If you need more memory than this, see the section on [[memory]]. The default number of nodes for a job on the main queue is 1. 8e82aad01fc3ed7e5ab17264e64545363d2c8b2c Cair-cluster 0 40 275 2012-09-18T10:06:45Z Jonnya 9 Created page with '== Cair data processing server == There is now a dedicated file server for cair users. The hostname is <code>cair-cluster</code> which is accessible from the private data netwo…' wikitext text/x-wiki == Cair data processing server == There is now a dedicated file server for cair users. The hostname is <code>cair-cluster</code> which is accessible from the private data network and the UH student network (using the FQDN <code>cair-cluster.herts.ac.uk</code>). The server is a Dell R520 with two Intel Xeon E5-2450L 1.80GHz processors and 32 GB RAM. It is connected to the "cair" InfiniBand network (192.168.4.0) via a dual-port QDR HBA. The server has ~ 77 TB of directly attached (via fibre channel) storage which has been configured to a RAID6 specification and is mounted as /cair-storage (on all cair nodes and the head node) This server can be used for post processing on large datasets. We have also enabled job submission on this server, so if preferred, cair users do not have to log on to <code>stri-cluster</code> at all. cdc9a7fbd101f5bdc1651e44ead8630ec03d25ae Jobs 0 9 279 250 2012-09-30T11:03:05Z Mjh 2 wikitext text/x-wiki The main mechanism for carrying out computing on the cluster is the '''batch queue system'''. This is based around [http://www.clusterresources.com/torquedocs21/ Torque] (formerly known as PBS) and [http://www.clusterresources.com/products/maui/docs/mauiadmin.shtml Maui], which are free software and widely used in the HPC world, particularly in academia. Torque is the queueing system, which handles job submission, and Maui is the scheduler, which decides what jobs are run. Rather than running processes directly on the compute nodes, you submit a script which, in general, contains both directives to the queuing system and the commands needed to run your jobs. The scheduler will then decide, based on the available resources, whether your job can be run immediately or must be deferred till later. If other people's jobs are running, you are likely to have to wait. It is important therefore to request only the minimum resources you need: a job that only needs 8 nodes may be able to run immediately while a job that requests all 48 nodes in the main cluster may need to wait for days. Once your job starts running the standard output and standard error can be saved or discarded, at your option. Most jobs will operate on files in one of the cluster-wide file systems (see [[storage]]). You can also arrange to have the system send you e-mail when your job starts and/or ends. You should enable [[passwordless ssh]] between nodes before starting to use the job submission system. Various parts of it and code that depends on it rely on being able to use ssh or scp. 
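A minimal sketch of setting this up, assuming (as described under [[storage]]) that your home directory is shared between the head node and the compute nodes, so that a single key pair is enough; see the [[passwordless ssh]] page for the definitive instructions:
<pre>
mkdir -p ~/.ssh && chmod 700 ~/.ssh
# generate a key with no passphrase, if you do not already have one
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
# authorize that key for logins to any node sharing this home directory
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
</pre>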
== Basic commands == The command to submit a job to the batch queue system is <tt>qsub</tt>. <tt>qsub</tt> takes one obligatory argument, the name of the script that will be run to start the job: <pre> qsub myjob.sh </pre> This must be a script: specifying a binary file will not work. Optionally, qsub can take a number of arguments that specify what resources the job requires and how it is to be handled by the queuing system. Alternatively, these can be specified in the script, tagged with lines beginning <tt>#PBS</tt> before any executable lines. For example, <pre> cat << END > myjob1.sh #!/bin/sh echo "Hello world" END qsub -N hello -m abe myjob1.sh </pre> and <pre> cat <<END > myjob2.sh #!/bin/sh #PBS -N hello #PBS -m abe echo "Hello world" END qsub myjob2.sh </pre> are functionally equivalent. It is often more useful to have the options specified in the script. You can also have a mixture: if the same option is specified on the <tt>qsub</tt> command line and in a script directive, the command line takes precedence. A full list of qsub options is given by <tt>man qsub</tt> or is available [http://www.clusterresources.com/torquedocs21/commands/qsub.shtml online]. Some important ones are as follows: * -N: specify the name of the job as it will appear in the queue * -m: say when the system should e-mail you: <tt>-m abe</tt> means to e-mail on abort, begin and end. Before setting this flag, read the page on [[Mail]] and take appropriate action! * -k: specify whether the standard output or standard error streams of the job should be mirrored in your home directory. If you specify this, you will be able to see the output as it's generated, but it will appear in your home directory. If you don't, the output will be stored in the directory specified by -o and -e or in the current working directory of qsub if they are not specified, but it will only appear once the job is finished. * -j: specify whether the output and error streams should be kept separate or merged. * -o, -e: specify where standard output/error should be stored, if not in the current directory * -q: specify what [[queues|queue]] the job will be run on. * -t: start multiple jobs simultaneously (see below) * -W: for job inter-dependencies (see below) * -v: to set environment variables (see below) The most important option is <tt>-l</tt>, which specifies the resources to be used by the job. On our system the useful resources to specify are * <tt>walltime</tt>: the expected execution time in h:m:s format. * <tt>nodes</tt>: a list of requirements on the nodes to be allocated, or possibly more than one joined by a plus sign. * <tt>pmem</tt>: the physical [[memory]] requirements for your job. 
For example, <pre> qsub -l walltime=2:0:0 -l nodes=40 job.sh ## run for two hours on 40 CPUs (which may be spread over up to 40 physical nodes) qsub -l walltime=1:0:0 -l nodes=16:ppn=8 job.sh ## run for one hour on 16 nodes with 8 CPUs per node qsub -l walltime=0:1:0 -l nodes=2:chassis1:ppn=4 job2.sh ## run for one minute on 2 nodes from chassis1 with 4 CPUs per node qsub -l walltime=0:5:0 -l nodes=1:chassis1:ppn=1+1:chassis2:ppn=1 job3.sh ## run for 5 minutes on one node from chassis1 and one from chassis2 qsub -l walltime=0:0:10 -l nodes=node001:ppn=8 job4.sh ## run for 10 seconds on all CPUs of node001 qsub -l walltime=0:1:0 -l nodes=1:2 -l pmem=8gb job5.sh ## Run two processes for one minute, requesting 8 Gb of memory each </pre> (In general you will not want to specify node or [[architecture|chassis]] names directly, as this will interfere with the efficiency of the scheduling. The exception might be if you want to guarantee that all inter-process communications take place within a chassis; see [[networking]]). Note also that in the present configuration requests for fewer processors per node than the 8 that are available may be consolidated onto fewer nodes. So the third request above, run on an empty cluster, would actually run on one physical node with 8 CPUs. This should never be a problem unless you are specifically testing inter-node communication. For efficiency, you are recommended always to ask for and use 8 processes per node for MPI or multi-threaded code. If you are running single-threaded code, you should ask for one processor per node. The defaults for all the other options are sensible but you ''must'' think carefully about the resources you need. Requesting more time or resources than you need will lead to inefficiency in scheduling your jobs: requesting less than you need (particularly in terms of walltime) is likely to mean that your job does not run to completion. A job that exceeds the walltime estimate will be terminated. 
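If you realize after submitting that your walltime estimate is wrong, you do not necessarily have to delete and resubmit: for a job that is still queued you can usually adjust the request (a sketch; 123456 stands for your job id, and some changes may be restricted by the administrators):
<pre>
qalter -l walltime=48:00:00 123456
</pre>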
''A walltime should be specified: the default walltime (see [[Queues]]) is probably not what you want.'' Once a job is submitted, you can view its progress with <tt>qstat</tt>: <pre> >qstat Job id Name User Time Use S Queue ------------------- ---------------- --------------- -------- - ----- 1765.stri-cluster HCG62-doublebeta mjh 00:00:00 R all 1766.stri-cluster ...59-doublebeta mjh 00:00:00 R all 1767.stri-cluster ...07-doublebeta mjh 00:00:00 R all 1768.stri-cluster ...83-doublebeta mjh 00:00:00 R all 1769.stri-cluster ...73-doublebeta mjh 00:00:00 R all 1770.stri-cluster ...61-doublebeta mjh 0 Q all 1771.stri-cluster ...36-doublebeta mjh 0 Q all 1772.stri-cluster ...44-doublebeta mjh 0 Q all 1773.stri-cluster ...29-doublebeta mjh 0 Q all 1774.stri-cluster ...33-doublebeta mjh 0 Q all 1775.stri-cluster ...46-doublebeta mjh 0 Q all >qstat -a stri-cluster.herts.ac.uk: Req'd Req'd Elap Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time -------------------- -------- -------- ---------- ------ ----- --- ------ ----- - ----- 1765.stri-cluster.he mjh all HCG62-doub 11125 16 -- -- 05:00 R 01:56 1766.stri-cluster.he mjh all IC1459-dou 10658 16 -- -- 05:00 R 01:56 1767.stri-cluster.he mjh all NGC3607-do 7355 16 -- -- 05:00 R 01:56 1768.stri-cluster.he mjh all NGC383-dou 12069 16 -- -- 05:00 R 01:56 1769.stri-cluster.he mjh all NGC4073-do 12793 16 -- -- 05:00 R 01:56 1770.stri-cluster.he mjh all NGC4261-do -- 16 -- -- 05:00 Q -- 1771.stri-cluster.he mjh all NGC4636-do -- 16 -- -- 05:00 Q -- 1772.stri-cluster.he mjh all NGC5044-do -- 16 -- -- 05:00 Q -- 1773.stri-cluster.he mjh all NGC5129-do -- 16 -- -- 05:00 Q -- 1774.stri-cluster.he mjh all NGC533-dou -- 16 -- -- 05:00 Q -- 1775.stri-cluster.he mjh all NGC5846-do -- 16 -- -- 05:00 Q -- </pre> The different options to qstat give different views of the queue. Here we see that some jobs are running (R) and some are queued (Q). You may also see jobs that have recently completed (C), jobs that are held pending some condition (H) and, hopefully rarely, jobs where there is some error in the scheduling system (E). The MAUI tool <tt>showq</tt> can also be used to get a view of the whole queue, which is sometimes more user-friendly: <pre> /usr/local/maui/bin/showq ACTIVE JOBS-------------------- JOBNAME USERNAME STATE PROC REMAINING STARTTIME 1765 mjh Running 128 1:15:20 Fri May 7 13:38:20 1766 mjh Running 128 1:15:20 Fri May 7 13:38:20 1767 mjh Running 128 1:15:20 Fri May 7 13:38:20 1768 mjh Running 128 1:15:20 Fri May 7 13:38:20 1769 mjh Running 128 1:15:20 Fri May 7 13:38:20 5 Active Jobs 640 of 640 Processors Active (100.00%) 80 of 80 Nodes Active (100.00%) IDLE JOBS---------------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME 1770 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1771 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1772 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1773 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1774 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1775 mjh Idle 128 5:00:00 Fri May 7 13:38:20 6 Idle Jobs BLOCKED JOBS---------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME Total Jobs: 11 Active Jobs: 5 Idle Jobs: 6 Blocked Jobs: 0 </pre> Finally, you can delete jobs from the queue with the Torque command <tt>qdel</tt>, you can change your resource request with <tt>qualter</tt>, and you can move queued jobs between different queues with <tt>qmove</tt>. == Running code == Your job will start running on ''one'' of the nodes it has been allocated. 
It is your script's responsibility to start your code running on any other nodes you have been allocated. At run time, environment variables contain information about the job. A full list is [http://www.clusterresources.com/torquedocs21/2.1jobsubmission.shtml#submitvars here]. The crucial one is $PBS_NODEFILE, which provides a list of nodes on which the job should run. ''You must honour this node list.'' The [http://www.clusterresources.com/torquedocs21/commands/pbsdsh.shtml pbsdsh] command can be used in your script to run a single program on all the allocated nodes, as in the following example: <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ pbsdsh hostname echo ------------------------------------------------------ echo Job ends </pre> pbsdsh would be suitable for starting a number of parallel processes which do not need to communicate, e.g. multiple identical Monte Carlo simulations. For scripts to start [[MPI]] processes in such a way that inter-process communication (IPC) works see the description of [[MPI]]. == Environment == In all cases your script must set the environment that you want your program to run in. Environment variables, aliases and working directory are not inherited from the shell that runs qsub. So you will often want to change to an appropriate working directory and to source some startup files in your qsub script before you actually execute any code. <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ cd /home/fred/my_working_directory export PATH=/home/fred/my_binaries:${PATH} /usr/local/bin/mpiexec my-mpi-code arg1 arg2 echo ------------------------------------------------------ echo Job ends </pre> Note that you can set environment variables in the environment of the running script with the <tt>-v</tt> option to <tt>qsub</tt>: <pre> qsub -v NAME=fred myjob.sh </pre> The value of <tt>$NAME</tt> is then available to the script. This is useful if you want to write a generic script and control it using a parameter, rather than editing the script. == Multiple job submission == The <tt>-t</tt> option to qsub allows you to create multiple jobs simultaneously. 
These will all run the same script but will differ in the value of the environment variable <tt>$PBS_ARRAYID</tt>. Your script should normally use this variable to decide how to behave differently. For example, multiple runs of a Monte Carlo simulation might use it to store the output in different files. The jobs will all be submitted to the queue immediately and may run concurrently, depending on the resources they request, so it is important to make sure that it is safe to do this. A common mistake is to use the -t option wrongly. <pre> qsub -t 4 myjob.qsub </pre> will only start ''one'' job, with $PBS_ARRAYID set to 4. <pre> qsub -t 1-4 myjob.qsub </pre> will queue four jobs, with $PBS_ARRAYID ranging from 1 to 4. == Interactive jobs == If you need to access nodes interactively, see the separate page on [[interactive jobs]]. == Jobs that depend on other jobs == You may wish to submit jobs that depend on each other: for example, to submit a job that only runs after another job has finished. (This allows you to stack up very long computing requests without violating the maximum walltime constraints.) Look at <tt>man qsub</tt> for the full details, but for a simple dependence use <pre> qsub -W depend=afterany:123456 myjob.qsub </pre> which will cause your job to be run only after job 123456 has finished (no matter what its output status). 2d4e202726d7f95e646b7017bca438855f0e1bac 293 279 2013-01-31T15:38:56Z Mjh 2 wikitext text/x-wiki The main mechanism for carrying out computing on the cluster is the '''batch queue system'''. This is based around [http://www.clusterresources.com/torquedocs21/ Torque] (formerly known as PBS) and [http://www.clusterresources.com/products/maui/docs/mauiadmin.shtml Maui], which are free software and widely used in the HPC world, particularly in academia. Torque is the queueing system, which handles job submission, and Maui is the scheduler, which decides what jobs are run. Rather than running processes directly on the compute nodes, you submit a script which, in general, contains both directives to the queuing system and the commands needed to run your jobs. The scheduler will then decide, based on the available resources, whether your job can be run immediately or must be deferred until later. If other people's jobs are running, you are likely to have to wait. It is therefore important to request only the minimum resources you need: a job that only needs 8 nodes may be able to run immediately while a job that requests all 48 nodes in the main cluster may need to wait for days. Once your job starts running, the standard output and standard error can be saved or discarded, at your option. Most jobs will operate on files in one of the cluster-wide file systems (see [[storage]]). You can also arrange to have the system send you e-mail when your job starts and/or ends. You should enable [[passwordless ssh]] between nodes before starting to use the job submission system. Various parts of the system, and code that depends on it, rely on being able to use ssh or scp. == Basic commands == The command to submit a job to the batch queue system is <tt>qsub</tt>. <tt>qsub</tt> takes one obligatory argument, the name of the script that will be run to start the job: <pre> qsub myjob.sh </pre> This must be a script: specifying a binary file will not work. Optionally, qsub can take a number of arguments that specify what resources the job requires and how it is to be handled by the queuing system.
Alternatively, these can be specified in the script, tagged with lines beginning <tt>#PBS</tt> before any executable lines. For example, <pre> cat << END > myjob1.sh #!/bin/sh echo "Hello world" END qsub -N hello -m abe myjob1.sh </pre> and <pre> cat <<END > myjob2.sh #!/bin/sh #PBS -N hello #PBS -m abe echo "Hello world" END qsub myjob2.sh </pre> are functionally equivalent. It is often more useful to have the options specified in the script. You can also have a mixture: if the same option is specified on the <tt>qsub</tt> command line and in a script directive, the command line takes precedence. A full list of qsub options is given by <tt>man qsub</tt> or is available [http://www.clusterresources.com/torquedocs21/commands/qsub.shtml online]. Some important ones are as follows: * -N: specify the name of the job as it will appear in the queue * -m: say when the system should e-mail you: <tt>-m abe</tt> means to e-mail on abort, begin and end. Before setting this flag, read the page on [[Mail]] and take appropriate action! * -k: specify whether the standard output or standard error streams of the job should be mirrored in your home directory. If you specify this, you will be able to see the output as it's generated, but it will appear in your home directory. If you don't, the output will be stored in the directory specified by -o and -e or in the current working directory of qsub if they are not specified, but it will only appear once the job is finished. * -j: specify whether the output and error streams should be kept separate or merged. * -o, -e: specify where standard output/error should be stored, if not in the current directory * -q: specify what [[queues|queue]] the job will be run on. * -t: start multiple jobs simultaneously (see below) * -W: for job inter-dependencies (see below) * -v: to set environment variables (see below) The most important option is <tt>-l</tt>, which specifies the resources to be used by the job. On our system the useful resources to specify are * <tt>walltime</tt>: the expected execution time in h:m:s format. * <tt>nodes</tt>: a list of requirements on the nodes to be allocated, or possibly more than one joined by a plus sign. * <tt>pmem</tt>: the physical [[memory]] requirements for your job. For example, <pre> qsub -l walltime=2:0:0 -l nodes=40 job.sh ## run for two hours on 40 CPUs (which may be spread over up to 40 physical nodes) qsub -l walltime=1:0:0 -l nodes=16:ppn=8 job.sh ## run for one hour on 16 nodes with 8 CPUs per node qsub -l walltime=0:1:0 -l nodes=2:chassis1:ppn=4 job2.sh ## run for one minute on 2 nodes from chassis1 with 4 CPUs per node qsub -l walltime=0:5:0 -l nodes=1:chassis1:ppn=1+1:chassis2:ppn=1 job3.sh ## run for 5 minutes on one node from chassis1 and one from chassis2 qsub -l walltime=0:0:10 -l nodes=node001:ppn=8 job4.sh ## run for 10 seconds on all CPUs of node001 qsub -l walltime=0:1:0 -l nodes=1:2 -l pmem=8gb job5.sh ## Run two processes for one minute, requesting 8 Gb of memory each </pre> (In general you will not want to specify node or [[architecture|chassis]] names directly, as this will interfere with the efficiency of the scheduling. The exception might be if you want to guarantee that all inter-process communications take place within a chassis; see [[networking]]). Note also that in the present configuration requests for fewer processors per node than the 8 that are available may be consolidated onto fewer nodes. So the third request above, run on an empty cluster, would actually run on one physical node with 8 CPUs. 
This should never be a problem unless you are specifically testing inter-node communication. For efficiency, you are recommended always to ask for and use 8 processes per node for MPI or multi-threaded code. If you are running single-threaded code, you should ask for one processor per node. The defaults for all the other options are sensible but you ''must'' think carefully about the resources you need. Requesting more time or resources than you need will lead to inefficiency in scheduling your jobs: requesting less than you need (particularly in terms of walltime) is likely to mean that your job does not run to completion. A job that exceeds the walltime estimate will be terminated. ''A walltime should be specified: the default walltime (see [[Queues]]) is probably not what you want.'' Once a job is submitted, you can view its progress with <tt>qstat</tt>: <pre> >qstat Job id Name User Time Use S Queue ------------------- ---------------- --------------- -------- - ----- 1765.stri-cluster HCG62-doublebeta mjh 00:00:00 R all 1766.stri-cluster ...59-doublebeta mjh 00:00:00 R all 1767.stri-cluster ...07-doublebeta mjh 00:00:00 R all 1768.stri-cluster ...83-doublebeta mjh 00:00:00 R all 1769.stri-cluster ...73-doublebeta mjh 00:00:00 R all 1770.stri-cluster ...61-doublebeta mjh 0 Q all 1771.stri-cluster ...36-doublebeta mjh 0 Q all 1772.stri-cluster ...44-doublebeta mjh 0 Q all 1773.stri-cluster ...29-doublebeta mjh 0 Q all 1774.stri-cluster ...33-doublebeta mjh 0 Q all 1775.stri-cluster ...46-doublebeta mjh 0 Q all >qstat -a stri-cluster.herts.ac.uk: Req'd Req'd Elap Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time -------------------- -------- -------- ---------- ------ ----- --- ------ ----- - ----- 1765.stri-cluster.he mjh all HCG62-doub 11125 16 -- -- 05:00 R 01:56 1766.stri-cluster.he mjh all IC1459-dou 10658 16 -- -- 05:00 R 01:56 1767.stri-cluster.he mjh all NGC3607-do 7355 16 -- -- 05:00 R 01:56 1768.stri-cluster.he mjh all NGC383-dou 12069 16 -- -- 05:00 R 01:56 1769.stri-cluster.he mjh all NGC4073-do 12793 16 -- -- 05:00 R 01:56 1770.stri-cluster.he mjh all NGC4261-do -- 16 -- -- 05:00 Q -- 1771.stri-cluster.he mjh all NGC4636-do -- 16 -- -- 05:00 Q -- 1772.stri-cluster.he mjh all NGC5044-do -- 16 -- -- 05:00 Q -- 1773.stri-cluster.he mjh all NGC5129-do -- 16 -- -- 05:00 Q -- 1774.stri-cluster.he mjh all NGC533-dou -- 16 -- -- 05:00 Q -- 1775.stri-cluster.he mjh all NGC5846-do -- 16 -- -- 05:00 Q -- </pre> The different options to qstat give different views of the queue. Here we see that some jobs are running (R) and some are queued (Q). You may also see jobs that have recently completed (C), jobs that are held pending some condition (H) and, hopefully rarely, jobs where there is some error in the scheduling system (E). 
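When the queue is busy it can be useful to narrow the <tt>qstat</tt> listing down; for example (these are standard Torque <tt>qstat</tt> options, see <tt>man qstat</tt>; the user name and job id are simply the ones from the listing above):
<pre>
qstat -u mjh     ## show only jobs belonging to user mjh
qstat -f 1765    ## full details of job 1765: resources requested, nodes allocated, output paths
</pre>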
The MAUI tool <tt>showq</tt> can also be used to get a view of the whole queue, which is sometimes more user-friendly: <pre> /usr/local/maui/bin/showq ACTIVE JOBS-------------------- JOBNAME USERNAME STATE PROC REMAINING STARTTIME 1765 mjh Running 128 1:15:20 Fri May 7 13:38:20 1766 mjh Running 128 1:15:20 Fri May 7 13:38:20 1767 mjh Running 128 1:15:20 Fri May 7 13:38:20 1768 mjh Running 128 1:15:20 Fri May 7 13:38:20 1769 mjh Running 128 1:15:20 Fri May 7 13:38:20 5 Active Jobs 640 of 640 Processors Active (100.00%) 80 of 80 Nodes Active (100.00%) IDLE JOBS---------------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME 1770 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1771 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1772 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1773 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1774 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1775 mjh Idle 128 5:00:00 Fri May 7 13:38:20 6 Idle Jobs BLOCKED JOBS---------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME Total Jobs: 11 Active Jobs: 5 Idle Jobs: 6 Blocked Jobs: 0 </pre> Finally, you can delete jobs from the queue with the Torque command <tt>qdel</tt>, you can change your resource request with <tt>qualter</tt>, and you can move queued jobs between different queues with <tt>qmove</tt>. == Running code == Your job will start running on ''one'' of the nodes it has been allocated. It is your script's responsibility to start your code running on any other nodes you have been allocated. At run time, environment variables contain information about the job. A full list is [http://www.clusterresources.com/torquedocs21/2.1jobsubmission.shtml#submitvars here]. The crucial one is $PBS_NODEFILE, which provides a list of nodes on which the job should run. ''You must honour this node list.'' The [http://www.clusterresources.com/torquedocs21/commands/pbsdsh.shtml pbsdsh] command can be used in your script to run a single program on all the allocated nodes, as in the following example: <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ pbsdsh hostname echo ------------------------------------------------------ echo Job ends </pre> pbsdsh would be suitable for starting a number of parallel processes which do not need to communicate, e.g. multiple identical Monte Carlo simulations. For scripts to start [[MPI]] processes in such a way that inter-process communication (IPC) works see the description of [[MPI]]. If you are running OpenMP code, which should only be running on a single node, then you need to make sure that the code does not grab all available CPUs. An easy way to make sure that you take only the CPUs that you have been allocated would be to put <pre> setenv OMP_NUM_THREADS `cat $PBS_NODEFILE | wc -l` </pre> in the qsub script before the code runs. 
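The <tt>setenv</tt> form above is C-shell syntax; if your submission script uses <tt>/bin/sh</tt>, as the examples on this page do, a minimal equivalent sketch is:
<pre>
# Use only as many OpenMP threads as the queue system has allocated CPUs
OMP_NUM_THREADS=`wc -l < $PBS_NODEFILE`
export OMP_NUM_THREADS
</pre>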
== Environment == In all cases your script must set the environment that you want your program to run in. Environment variables, aliases and working directory are not inherited from the shell that runs qsub. So you will often want to change to an appropriate working directory and to source some startup files in your qsub script before you actually execute any code. <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ cd /home/fred/my_working_directory export PATH=/home/fred/my_binaries:${PATH} /usr/local/bin/mpiexec my-mpi-code arg1 arg2 echo ------------------------------------------------------ echo Job ends </pre> Note that you can set environment variables in the environment of the running script with the <tt>-v</tt> option to <tt>qsub</tt>: <pre> qsub -v NAME=fred myjob.sh </pre> The value of <tt>$NAME</tt> is then available to the script. This is useful if you want to write a generic script and control it using a parameter, rather than editing the script. == Multiple job submission == The <tt>-t</tt> option to qsub allows you to create multiple jobs simultaneously. These will all run the same script but will differ in the value of the environment variable <tt>$PBS_ARRAYID</tt>. Your script should normally use this variable to decide how to behave differently. For example, multiple runs of a Monte Carlo simulation might use it to store the output in different files. The jobs will all be submitted to the queue immediately and may run concurrently, depending on the resources they request, so it is important to make sure that it is safe to do this. A common mistake is to use the -t option wrongly. <pre> qsub -t 4 myjob.qsub </pre> will only start ''one'' job, with $PBS_ARRAYID set to 4. <pre> qsub -t 1-4 myjob.qsub </pre> will queue four jobs, with $PBS_ARRAYID ranging from 1 to 4. == Interactive jobs == If you need to access nodes interactively, see the separate page on [[interactive jobs]]. == Jobs that depend on other jobs == You may wish to submit jobs that depend on each other: for example, to submit a job that only runs after another job has finished. (This allows you to stack up very long computing requests without violating the maximum walltime constraints. Look at <tt>man qsub</tt> for the full details, but for a simple dependence use <pre> qsub -W depend=afterany:123456 myjob.qsub </pre> which will cause your job to be run only after job 123456 has finished (no matter what its output status). 279cf2f62e243aed2b09f2dcff06c8275c69bb2c 294 293 2013-01-31T15:39:35Z Mjh 2 wikitext text/x-wiki The main mechanism for carrying out computing on the cluster is the '''batch queue system'''. 
This is based around [http://www.clusterresources.com/torquedocs21/ Torque] (formerly known as PBS) and [http://www.clusterresources.com/products/maui/docs/mauiadmin.shtml Maui], which are free software and widely used in the HPC world, particularly in academia. Torque is the queueing system, which handles job submission, and Maui is the scheduler, which decides what jobs are run. Rather than running processes directly on the compute nodes, you submit a script which, in general, contains both directives to the queuing system and the commands needed to run your jobs. The scheduler will then decide, based on the available resources, whether your job can be run immediately or must be deferred till later. If other people's jobs are running, you are likely to have to wait. It is important therefore to request only the minimum resources you need: a job that only needs 8 nodes may be able to run immediately while a job that requests all 48 nodes in the main cluster may need to wait for days. Once your job starts running the standard output and standard error can be saved or discarded, at your option. Most jobs will operate on files in one of the cluster-wide file systems (see [[storage]]). You can also arrange to have the system send you e-mail when your job starts and/or ends. You should enable [[passwordless ssh]] between nodes before starting to use the job submission system. Various parts of it and code that depends on it rely on being able to use ssh or scp. == Basic commands == The command to submit a job to the batch queue system is <tt>qsub</tt>. <tt>qsub</tt> takes one obligatory argument, the name of the script that will be run to start the job: <pre> qsub myjob.sh </pre> This must be a script: specifying a binary file will not work. Optionally, qsub can take a number of arguments that specify what resources the job requires and how it is to be handled by the queuing system. Alternatively, these can be specified in the script, tagged with lines beginning <tt>#PBS</tt> before any executable lines. For example, <pre> cat << END > myjob1.sh #!/bin/sh echo "Hello world" END qsub -N hello -m abe myjob1.sh </pre> and <pre> cat <<END > myjob2.sh #!/bin/sh #PBS -N hello #PBS -m abe echo "Hello world" END qsub myjob2.sh </pre> are functionally equivalent. It is often more useful to have the options specified in the script. You can also have a mixture: if the same option is specified on the <tt>qsub</tt> command line and in a script directive, the command line takes precedence. A full list of qsub options is given by <tt>man qsub</tt> or is available [http://www.clusterresources.com/torquedocs21/commands/qsub.shtml online]. Some important ones are as follows: * -N: specify the name of the job as it will appear in the queue * -m: say when the system should e-mail you: <tt>-m abe</tt> means to e-mail on abort, begin and end. Before setting this flag, read the page on [[Mail]] and take appropriate action! * -k: specify whether the standard output or standard error streams of the job should be mirrored in your home directory. If you specify this, you will be able to see the output as it's generated, but it will appear in your home directory. If you don't, the output will be stored in the directory specified by -o and -e or in the current working directory of qsub if they are not specified, but it will only appear once the job is finished. * -j: specify whether the output and error streams should be kept separate or merged. 
* -o, -e: specify where standard output/error should be stored, if not in the current directory * -q: specify what [[queues|queue]] the job will be run on. * -t: start multiple jobs simultaneously (see below) * -W: for job inter-dependencies (see below) * -v: to set environment variables (see below) The most important option is <tt>-l</tt>, which specifies the resources to be used by the job. On our system the useful resources to specify are * <tt>walltime</tt>: the expected execution time in h:m:s format. * <tt>nodes</tt>: a list of requirements on the nodes to be allocated, or possibly more than one joined by a plus sign. * <tt>pmem</tt>: the physical [[memory]] requirements for your job. For example, <pre> qsub -l walltime=2:0:0 -l nodes=40 job.sh ## run for two hours on 40 CPUs (which may be spread over up to 40 physical nodes) qsub -l walltime=1:0:0 -l nodes=16:ppn=8 job.sh ## run for one hour on 16 nodes with 8 CPUs per node qsub -l walltime=0:1:0 -l nodes=2:chassis1:ppn=4 job2.sh ## run for one minute on 2 nodes from chassis1 with 4 CPUs per node qsub -l walltime=0:5:0 -l nodes=1:chassis1:ppn=1+1:chassis2:ppn=1 job3.sh ## run for 5 minutes on one node from chassis1 and one from chassis2 qsub -l walltime=0:0:10 -l nodes=node001:ppn=8 job4.sh ## run for 10 seconds on all CPUs of node001 qsub -l walltime=0:1:0 -l nodes=1:2 -l pmem=8gb job5.sh ## Run two processes for one minute, requesting 8 Gb of memory each </pre> (In general you will not want to specify node or [[architecture|chassis]] names directly, as this will interfere with the efficiency of the scheduling. The exception might be if you want to guarantee that all inter-process communications take place within a chassis; see [[networking]]). Note also that in the present configuration requests for fewer processors per node than the 8 that are available may be consolidated onto fewer nodes. So the third request above, run on an empty cluster, would actually run on one physical node with 8 CPUs. This should never be a problem unless you are specifically testing inter-node communication. For efficiency, you are recommended always to ask for and use 8 processes per node for MPI or multi-threaded code. If you are running single-threaded code, you should ask for one processor per node. The defaults for all the other options are sensible but you ''must'' think carefully about the resources you need. Requesting more time or resources than you need will lead to inefficiency in scheduling your jobs: requesting less than you need (particularly in terms of walltime) is likely to mean that your job does not run to completion. A job that exceeds the walltime estimate will be terminated. 
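Bear in mind that <tt>pmem</tt> is a per-process limit, so, as a rough guide (see the [[memory]] page for how memory is actually accounted), a request like the hypothetical one below amounts to about 8 x 2 = 16GB of physical memory on the node:
<pre>
qsub -l walltime=1:0:0 -l nodes=1:ppn=8 -l pmem=2gb myjob.sh  ## 8 processes, each limited to 2GB
</pre>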
''A walltime should be specified: the default walltime (see [[Queues]]) is probably not what you want.'' Once a job is submitted, you can view its progress with <tt>qstat</tt>: <pre> >qstat Job id Name User Time Use S Queue ------------------- ---------------- --------------- -------- - ----- 1765.stri-cluster HCG62-doublebeta mjh 00:00:00 R all 1766.stri-cluster ...59-doublebeta mjh 00:00:00 R all 1767.stri-cluster ...07-doublebeta mjh 00:00:00 R all 1768.stri-cluster ...83-doublebeta mjh 00:00:00 R all 1769.stri-cluster ...73-doublebeta mjh 00:00:00 R all 1770.stri-cluster ...61-doublebeta mjh 0 Q all 1771.stri-cluster ...36-doublebeta mjh 0 Q all 1772.stri-cluster ...44-doublebeta mjh 0 Q all 1773.stri-cluster ...29-doublebeta mjh 0 Q all 1774.stri-cluster ...33-doublebeta mjh 0 Q all 1775.stri-cluster ...46-doublebeta mjh 0 Q all >qstat -a stri-cluster.herts.ac.uk: Req'd Req'd Elap Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time -------------------- -------- -------- ---------- ------ ----- --- ------ ----- - ----- 1765.stri-cluster.he mjh all HCG62-doub 11125 16 -- -- 05:00 R 01:56 1766.stri-cluster.he mjh all IC1459-dou 10658 16 -- -- 05:00 R 01:56 1767.stri-cluster.he mjh all NGC3607-do 7355 16 -- -- 05:00 R 01:56 1768.stri-cluster.he mjh all NGC383-dou 12069 16 -- -- 05:00 R 01:56 1769.stri-cluster.he mjh all NGC4073-do 12793 16 -- -- 05:00 R 01:56 1770.stri-cluster.he mjh all NGC4261-do -- 16 -- -- 05:00 Q -- 1771.stri-cluster.he mjh all NGC4636-do -- 16 -- -- 05:00 Q -- 1772.stri-cluster.he mjh all NGC5044-do -- 16 -- -- 05:00 Q -- 1773.stri-cluster.he mjh all NGC5129-do -- 16 -- -- 05:00 Q -- 1774.stri-cluster.he mjh all NGC533-dou -- 16 -- -- 05:00 Q -- 1775.stri-cluster.he mjh all NGC5846-do -- 16 -- -- 05:00 Q -- </pre> The different options to qstat give different views of the queue. Here we see that some jobs are running (R) and some are queued (Q). You may also see jobs that have recently completed (C), jobs that are held pending some condition (H) and, hopefully rarely, jobs where there is some error in the scheduling system (E). The MAUI tool <tt>showq</tt> can also be used to get a view of the whole queue, which is sometimes more user-friendly: <pre> /usr/local/maui/bin/showq ACTIVE JOBS-------------------- JOBNAME USERNAME STATE PROC REMAINING STARTTIME 1765 mjh Running 128 1:15:20 Fri May 7 13:38:20 1766 mjh Running 128 1:15:20 Fri May 7 13:38:20 1767 mjh Running 128 1:15:20 Fri May 7 13:38:20 1768 mjh Running 128 1:15:20 Fri May 7 13:38:20 1769 mjh Running 128 1:15:20 Fri May 7 13:38:20 5 Active Jobs 640 of 640 Processors Active (100.00%) 80 of 80 Nodes Active (100.00%) IDLE JOBS---------------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME 1770 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1771 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1772 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1773 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1774 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1775 mjh Idle 128 5:00:00 Fri May 7 13:38:20 6 Idle Jobs BLOCKED JOBS---------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME Total Jobs: 11 Active Jobs: 5 Idle Jobs: 6 Blocked Jobs: 0 </pre> Finally, you can delete jobs from the queue with the Torque command <tt>qdel</tt>, you can change your resource request with <tt>qualter</tt>, and you can move queued jobs between different queues with <tt>qmove</tt>. == Running code == Your job will start running on ''one'' of the nodes it has been allocated. 
It is your script's responsibility to start your code running on any other nodes you have been allocated. At run time, environment variables contain information about the job. A full list is [http://www.clusterresources.com/torquedocs21/2.1jobsubmission.shtml#submitvars here]. The crucial one is $PBS_NODEFILE, which provides a list of nodes on which the job should run. ''You must honour this node list.'' The [http://www.clusterresources.com/torquedocs21/commands/pbsdsh.shtml pbsdsh] command can be used in your script to run a single program on all the allocated nodes, as in the following example: <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ pbsdsh hostname echo ------------------------------------------------------ echo Job ends </pre> pbsdsh would be suitable for starting a number of parallel processes which do not need to communicate, e.g. multiple identical Monte Carlo simulations. For scripts to start [[MPI]] processes in such a way that inter-process communication (IPC) works see the description of [[MPI]]. If you are running [[OpenMP]] code, which should only be running on a single node, then you need to make sure that the code does not grab all available CPUs. An easy way to make sure that you take only the CPUs that you have been allocated would be to put <pre> setenv OMP_NUM_THREADS `cat $PBS_NODEFILE | wc -l` </pre> in the qsub script before the code runs. == Environment == In all cases your script must set the environment that you want your program to run in. Environment variables, aliases and working directory are not inherited from the shell that runs qsub. So you will often want to change to an appropriate working directory and to source some startup files in your qsub script before you actually execute any code. 
<pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ cd /home/fred/my_working_directory export PATH=/home/fred/my_binaries:${PATH} /usr/local/bin/mpiexec my-mpi-code arg1 arg2 echo ------------------------------------------------------ echo Job ends </pre> Note that you can set environment variables in the environment of the running script with the <tt>-v</tt> option to <tt>qsub</tt>: <pre> qsub -v NAME=fred myjob.sh </pre> The value of <tt>$NAME</tt> is then available to the script. This is useful if you want to write a generic script and control it using a parameter, rather than editing the script. == Multiple job submission == The <tt>-t</tt> option to qsub allows you to create multiple jobs simultaneously. These will all run the same script but will differ in the value of the environment variable <tt>$PBS_ARRAYID</tt>. Your script should normally use this variable to decide how to behave differently. For example, multiple runs of a Monte Carlo simulation might use it to store the output in different files. The jobs will all be submitted to the queue immediately and may run concurrently, depending on the resources they request, so it is important to make sure that it is safe to do this. A common mistake is to use the -t option wrongly. <pre> qsub -t 4 myjob.qsub </pre> will only start ''one'' job, with $PBS_ARRAYID set to 4. <pre> qsub -t 1-4 myjob.qsub </pre> will queue four jobs, with $PBS_ARRAYID ranging from 1 to 4. == Interactive jobs == If you need to access nodes interactively, see the separate page on [[interactive jobs]]. == Jobs that depend on other jobs == You may wish to submit jobs that depend on each other: for example, to submit a job that only runs after another job has finished. (This allows you to stack up very long computing requests without violating the maximum walltime constraints. Look at <tt>man qsub</tt> for the full details, but for a simple dependence use <pre> qsub -W depend=afterany:123456 myjob.qsub </pre> which will cause your job to be run only after job 123456 has finished (no matter what its output status). 4175245614aa6f63924fae71a359571f27fb3074 311 294 2013-04-11T13:43:56Z Jonnya 9 wikitext text/x-wiki The main mechanism for carrying out computing on the cluster is the '''batch queue system'''. This is based around [http://www.clusterresources.com/torquedocs21/ Torque] (formerly known as PBS) and [http://www.clusterresources.com/products/maui/docs/mauiadmin.shtml Maui], which are free software and widely used in the HPC world, particularly in academia. Torque is the queueing system, which handles job submission, and Maui is the scheduler, which decides what jobs are run. Rather than running processes directly on the compute nodes, you submit a script which, in general, contains both directives to the queuing system and the commands needed to run your jobs. 
The scheduler will then decide, based on the available resources, whether your job can be run immediately or must be deferred till later. If other people's jobs are running, you are likely to have to wait. It is important therefore to request only the minimum resources you need: a job that only needs 8 nodes may be able to run immediately while a job that requests all 48 nodes in the main cluster may need to wait for days. Once your job starts running the standard output and standard error can be saved or discarded, at your option. Most jobs will operate on files in one of the cluster-wide file systems (see [[storage]]). You can also arrange to have the system send you e-mail when your job starts and/or ends. You should enable [[passwordless ssh]] between nodes before starting to use the job submission system. Various parts of it and code that depends on it rely on being able to use ssh or scp. == Basic commands == The command to submit a job to the batch queue system is <tt>qsub</tt>. <tt>qsub</tt> takes one obligatory argument, the name of the script that will be run to start the job: <pre> qsub myjob.sh </pre> This must be a script: specifying a binary file will not work. Optionally, qsub can take a number of arguments that specify what resources the job requires and how it is to be handled by the queuing system. Alternatively, these can be specified in the script, tagged with lines beginning <tt>#PBS</tt> before any executable lines. For example, <pre> cat << END > myjob1.sh #!/bin/sh echo "Hello world" END qsub -N hello -m abe myjob1.sh </pre> and <pre> cat <<END > myjob2.sh #!/bin/sh #PBS -N hello #PBS -m abe echo "Hello world" END qsub myjob2.sh </pre> are functionally equivalent. It is often more useful to have the options specified in the script. You can also have a mixture: if the same option is specified on the <tt>qsub</tt> command line and in a script directive, the command line takes precedence. A full list of qsub options is given by <tt>man qsub</tt> or is available [http://www.clusterresources.com/torquedocs21/commands/qsub.shtml online]. Some important ones are as follows: * -N: specify the name of the job as it will appear in the queue * -m: say when the system should e-mail you: <tt>-m abe</tt> means to e-mail on abort, begin and end. Before setting this flag, read the page on [[Mail]] and take appropriate action! * -k: specify whether the standard output or standard error streams of the job should be mirrored in your home directory. If you specify this, you will be able to see the output as it's generated, but it will appear in your home directory. If you don't, the output will be stored in the directory specified by -o and -e or in the current working directory of qsub if they are not specified, but it will only appear once the job is finished. * -j: specify whether the output and error streams should be kept separate or merged. * -o, -e: specify where standard output/error should be stored, if not in the current directory * -q: specify what [[queues|queue]] the job will be run on. * -t: start multiple jobs simultaneously (see below) * -W: for job inter-dependencies (see below) * -v: to set environment variables (see below) The most important option is <tt>-l</tt>, which specifies the resources to be used by the job. On our system the useful resources to specify are * <tt>walltime</tt>: the expected execution time in h:m:s format (if unspecified, the default walltime on all queues is 24 hours). 
* <tt>nodes</tt>: a list of requirements on the nodes to be allocated, or possibly more than one joined by a plus sign. * <tt>pmem</tt>: the physical [[memory]] requirements for your job. For example, <pre> qsub -l walltime=2:0:0 -l nodes=40 job.sh ## run for two hours on 40 CPUs (which may be spread over up to 40 physical nodes) qsub -l walltime=1:0:0 -l nodes=16:ppn=8 job.sh ## run for one hour on 16 nodes with 8 CPUs per node qsub -l walltime=0:1:0 -l nodes=2:chassis1:ppn=4 job2.sh ## run for one minute on 2 nodes from chassis1 with 4 CPUs per node qsub -l walltime=0:5:0 -l nodes=1:chassis1:ppn=1+1:chassis2:ppn=1 job3.sh ## run for 5 minutes on one node from chassis1 and one from chassis2 qsub -l walltime=0:0:10 -l nodes=node001:ppn=8 job4.sh ## run for 10 seconds on all CPUs of node001 qsub -l walltime=0:1:0 -l nodes=1:2 -l pmem=8gb job5.sh ## Run two processes for one minute, requesting 8 Gb of memory each </pre> (In general you will not want to specify node or [[architecture|chassis]] names directly, as this will interfere with the efficiency of the scheduling. The exception might be if you want to guarantee that all inter-process communications take place within a chassis; see [[networking]]). Note also that in the present configuration requests for fewer processors per node than the 8 that are available may be consolidated onto fewer nodes. So the third request above, run on an empty cluster, would actually run on one physical node with 8 CPUs. This should never be a problem unless you are specifically testing inter-node communication. For efficiency, you are recommended always to ask for and use 8 processes per node for MPI or multi-threaded code. If you are running single-threaded code, you should ask for one processor per node. The defaults for all the other options are sensible but you ''must'' think carefully about the resources you need. Requesting more time or resources than you need will lead to inefficiency in scheduling your jobs: requesting less than you need (particularly in terms of walltime) is likely to mean that your job does not run to completion. A job that exceeds the walltime estimate will be terminated. 
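To illustrate the recommendation above (the walltimes and script names here are placeholders only):
<pre>
qsub -l walltime=4:0:0 -l nodes=2:ppn=8 my_mpi_job.sh     ## MPI or multi-threaded code: ask for whole nodes
qsub -l walltime=4:0:0 -l nodes=1:ppn=1 my_serial_job.sh  ## single-threaded code: one processor is enough
</pre>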
''A walltime should be specified: the default walltime (see [[Queues]]) is probably not what you want.'' Once a job is submitted, you can view its progress with <tt>qstat</tt>: <pre> >qstat Job id Name User Time Use S Queue ------------------- ---------------- --------------- -------- - ----- 1765.stri-cluster HCG62-doublebeta mjh 00:00:00 R all 1766.stri-cluster ...59-doublebeta mjh 00:00:00 R all 1767.stri-cluster ...07-doublebeta mjh 00:00:00 R all 1768.stri-cluster ...83-doublebeta mjh 00:00:00 R all 1769.stri-cluster ...73-doublebeta mjh 00:00:00 R all 1770.stri-cluster ...61-doublebeta mjh 0 Q all 1771.stri-cluster ...36-doublebeta mjh 0 Q all 1772.stri-cluster ...44-doublebeta mjh 0 Q all 1773.stri-cluster ...29-doublebeta mjh 0 Q all 1774.stri-cluster ...33-doublebeta mjh 0 Q all 1775.stri-cluster ...46-doublebeta mjh 0 Q all >qstat -a stri-cluster.herts.ac.uk: Req'd Req'd Elap Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time -------------------- -------- -------- ---------- ------ ----- --- ------ ----- - ----- 1765.stri-cluster.he mjh all HCG62-doub 11125 16 -- -- 05:00 R 01:56 1766.stri-cluster.he mjh all IC1459-dou 10658 16 -- -- 05:00 R 01:56 1767.stri-cluster.he mjh all NGC3607-do 7355 16 -- -- 05:00 R 01:56 1768.stri-cluster.he mjh all NGC383-dou 12069 16 -- -- 05:00 R 01:56 1769.stri-cluster.he mjh all NGC4073-do 12793 16 -- -- 05:00 R 01:56 1770.stri-cluster.he mjh all NGC4261-do -- 16 -- -- 05:00 Q -- 1771.stri-cluster.he mjh all NGC4636-do -- 16 -- -- 05:00 Q -- 1772.stri-cluster.he mjh all NGC5044-do -- 16 -- -- 05:00 Q -- 1773.stri-cluster.he mjh all NGC5129-do -- 16 -- -- 05:00 Q -- 1774.stri-cluster.he mjh all NGC533-dou -- 16 -- -- 05:00 Q -- 1775.stri-cluster.he mjh all NGC5846-do -- 16 -- -- 05:00 Q -- </pre> The different options to qstat give different views of the queue. Here we see that some jobs are running (R) and some are queued (Q). You may also see jobs that have recently completed (C), jobs that are held pending some condition (H) and, hopefully rarely, jobs where there is some error in the scheduling system (E). The MAUI tool <tt>showq</tt> can also be used to get a view of the whole queue, which is sometimes more user-friendly: <pre> /usr/local/maui/bin/showq ACTIVE JOBS-------------------- JOBNAME USERNAME STATE PROC REMAINING STARTTIME 1765 mjh Running 128 1:15:20 Fri May 7 13:38:20 1766 mjh Running 128 1:15:20 Fri May 7 13:38:20 1767 mjh Running 128 1:15:20 Fri May 7 13:38:20 1768 mjh Running 128 1:15:20 Fri May 7 13:38:20 1769 mjh Running 128 1:15:20 Fri May 7 13:38:20 5 Active Jobs 640 of 640 Processors Active (100.00%) 80 of 80 Nodes Active (100.00%) IDLE JOBS---------------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME 1770 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1771 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1772 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1773 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1774 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1775 mjh Idle 128 5:00:00 Fri May 7 13:38:20 6 Idle Jobs BLOCKED JOBS---------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME Total Jobs: 11 Active Jobs: 5 Idle Jobs: 6 Blocked Jobs: 0 </pre> Finally, you can delete jobs from the queue with the Torque command <tt>qdel</tt>, you can change your resource request with <tt>qualter</tt>, and you can move queued jobs between different queues with <tt>qmove</tt>. == Running code == Your job will start running on ''one'' of the nodes it has been allocated. 
It is your script's responsibility to start your code running on any other nodes you have been allocated. At run time, environment variables contain information about the job. A full list is [http://www.clusterresources.com/torquedocs21/2.1jobsubmission.shtml#submitvars here]. The crucial one is $PBS_NODEFILE, which provides a list of nodes on which the job should run. ''You must honour this node list.'' The [http://www.clusterresources.com/torquedocs21/commands/pbsdsh.shtml pbsdsh] command can be used in your script to run a single program on all the allocated nodes, as in the following example: <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ pbsdsh hostname echo ------------------------------------------------------ echo Job ends </pre> pbsdsh would be suitable for starting a number of parallel processes which do not need to communicate, e.g. multiple identical Monte Carlo simulations. For scripts to start [[MPI]] processes in such a way that inter-process communication (IPC) works see the description of [[MPI]]. If you are running [[OpenMP]] code, which should only be running on a single node, then you need to make sure that the code does not grab all available CPUs. An easy way to make sure that you take only the CPUs that you have been allocated would be to put <pre> setenv OMP_NUM_THREADS `cat $PBS_NODEFILE | wc -l` </pre> in the qsub script before the code runs. == Environment == In all cases your script must set the environment that you want your program to run in. Environment variables, aliases and working directory are not inherited from the shell that runs qsub. So you will often want to change to an appropriate working directory and to source some startup files in your qsub script before you actually execute any code. 
<pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ cd /home/fred/my_working_directory export PATH=/home/fred/my_binaries:${PATH} /usr/local/bin/mpiexec my-mpi-code arg1 arg2 echo ------------------------------------------------------ echo Job ends </pre> Note that you can set environment variables in the environment of the running script with the <tt>-v</tt> option to <tt>qsub</tt>: <pre> qsub -v NAME=fred myjob.sh </pre> The value of <tt>$NAME</tt> is then available to the script. This is useful if you want to write a generic script and control it using a parameter, rather than editing the script. == Multiple job submission == The <tt>-t</tt> option to qsub allows you to create multiple jobs simultaneously. These will all run the same script but will differ in the value of the environment variable <tt>$PBS_ARRAYID</tt>. Your script should normally use this variable to decide how to behave differently. For example, multiple runs of a Monte Carlo simulation might use it to store the output in different files. The jobs will all be submitted to the queue immediately and may run concurrently, depending on the resources they request, so it is important to make sure that it is safe to do this. A common mistake is to use the -t option wrongly. <pre> qsub -t 4 myjob.qsub </pre> will only start ''one'' job, with $PBS_ARRAYID set to 4. <pre> qsub -t 1-4 myjob.qsub </pre> will queue four jobs, with $PBS_ARRAYID ranging from 1 to 4. == Interactive jobs == If you need to access nodes interactively, see the separate page on [[interactive jobs]]. == Jobs that depend on other jobs == You may wish to submit jobs that depend on each other: for example, to submit a job that only runs after another job has finished. (This allows you to stack up very long computing requests without violating the maximum walltime constraints. Look at <tt>man qsub</tt> for the full details, but for a simple dependence use <pre> qsub -W depend=afterany:123456 myjob.qsub </pre> which will cause your job to be run only after job 123456 has finished (no matter what its output status). bc71e160b9c06a218eb3d296ec0e2a9f45b0e4fd Known problems 0 25 280 252 2012-09-30T11:05:39Z Mjh 2 wikitext text/x-wiki == Known problems == * Although the Infiniband links carry NFS traffic between the nodes and the head node, they are not using RDMA and so their latency is higher and bandwidth lower than it should be. This seems to be a problem with the Linux kernels on the nodes; NFS over RDMA+infiniband is unstable. * Most of the compute nodes run FC16, but the head node, where code is compiled, is still running FC14. This can cause library incompatibilities. Let us know if this affects you and we can work around it. 
a33d0cd11d8901cf9ee1d5bc607fb9a7d66823e2 296 280 2013-02-08T21:25:39Z Mjh 2 /* Known problems */ wikitext text/x-wiki == Known problems == * Although the Infiniband links carry NFS traffic between the nodes and the head node, they are not using RDMA and so their latency is higher and bandwidth lower than it should be. This seems to be a problem with the Linux kernels on the nodes; NFS over RDMA+infiniband is unstable. 529c6c4254222a7fc6a4546e7d1e9f2ac3317954 Cluster bibliography 0 30 282 249 2012-11-30T11:55:40Z Mjh 2 wikitext text/x-wiki Please add details of any papers written using the cluster here. This allows us to keep track of the productivity of the cluster. You can use any format you like so long as you identify the authors, the title (this helps us keep track of what different areas of research are being carried out with the cluster), and an indication of the bibliographic details and date. * Steuernagel O., Kakofengitis D., Ritter G., Wigner flow reveals topological order in quantum phase space dynamics, '''2012''', accepted by ''PRL''. * Hardcastle MJ, Krause MGH, Numerical modelling of the lobes of radio galaxies in cluster environments, '''2012''', submitted to ''MNRAS'' * Kalia M, Kukol A, Structure and dynamics of the kinase IKK-β – a key regulator of the NF-kappa B transcription factor, ''J Struct Biol'', '''2011''', doi:10.1016/j.jsb.2011.07.012 * Kukol A, Consensus virtual screening approaches to predict protein ligands, ''Eur J Med Chem'', '''2011''', doi:10.1016/j.ejmech.2011.05.026 * Hardcastle MJ, Croston JH, Modelling TeV gamma-ray emission from the kiloparsec-scale jets of Centaurus A and M87, '''2011''', ''MNRAS'' 415 433 * Goodger JL, Croston JH, Hardcastle MJ, The influence of radio-loud AGN on groups of galaxies: Paper I - the X-ray luminosity-temperature relationship, submitted to ''MNRAS'', May 2011. * de Sousa G, Maex R, Adams R, Davey N, Steuber V, Optimization of neuronal morphologies for pattern recognition, published at ''BMC Neuroscience'' '''2010''' * Karen Safaryan, Reinoud Maex, Rod Adams, Neil Davey, and Volker Steuber The Effect of Non-Specific LTD on Pattern Recognition in Cerebellar Purkinje Cells, published at ''BMC Neuroscience'' '''11''', P92 df93f749958a5cdf58c7c71693b91a7b4189767c Main Page 0 1 283 262 2012-12-07T09:53:17Z Mjh 2 wikitext text/x-wiki == Welcome to the cluster documentation wiki == This wiki is the location for documentation for the STRI cluster. If you are a cluster user, feel free to register for an account so that you can describe any method you use to do a particular thing on the cluster. Some local documentation on [[MediaWiki]] is available. == Known problems == * [[Known problems]] * [[Upgrade wishlist]] == Cluster basics == * [[Accounts]] * [[Policies]] * [[Fair share]] * [[Access]] * [[Architecture]] * [[Networking]] * [[Storage]]; [[quota]] system * [[Administrators]]' contact details == Using the cluster == * [[Jobs]] * [[Queues]] * [[SMP machines]] * [[Sandbox]] * [[Tesla]] * [[Modules]] * [[MPI]] * [[Parallelization|How to parallelize your job]] * [[Compilers]] * [[Software]] * [[Mail]] * [[Web server]] == Troubleshooting == * [[Why doesn't my job run?]] == Publications == * [[Acknowledgements]] * [[Cluster bibliography]] 476d40722cea3afdf130b171b2bc674af5f5dddc 290 283 2012-12-22T07:36:45Z Mjh 2 /* Welcome to the cluster documentation wiki */ wikitext text/x-wiki == Welcome to the cluster documentation wiki == This wiki is the location for documentation for the STRI cluster. 
If you are a cluster user, feel free to register for an account so that you can describe any method you use to do a particular thing on the cluster. You must be approved for an account before editing the Wiki. Some local documentation on [[MediaWiki]] is available. == Known problems == * [[Known problems]] * [[Upgrade wishlist]] == Cluster basics == * [[Accounts]] * [[Policies]] * [[Fair share]] * [[Access]] * [[Architecture]] * [[Networking]] * [[Storage]]; [[quota]] system * [[Administrators]]' contact details == Using the cluster == * [[Jobs]] * [[Queues]] * [[SMP machines]] * [[Sandbox]] * [[Tesla]] * [[Modules]] * [[MPI]] * [[Parallelization|How to parallelize your job]] * [[Compilers]] * [[Software]] * [[Mail]] * [[Web server]] == Troubleshooting == * [[Why doesn't my job run?]] == Publications == * [[Acknowledgements]] * [[Cluster bibliography]] 5a12b4bd70b4f598a26fbe0ac925bb8cf2728969 302 290 2013-02-23T12:58:46Z Mjh 2 wikitext text/x-wiki == Welcome to the cluster documentation wiki == This wiki is the location for documentation for the STRI cluster. If you are a cluster user, feel free to register for an account so that you can describe any method you use to do a particular thing on the cluster. You must be approved for an account before editing the Wiki. Some local documentation on [[MediaWiki]] is available. == Known problems == * [[Known problems]] * [[Upgrade wishlist]] == Cluster basics == * [[Accounts]] * [[Policies]] * [[Fair share]] * [[Access]] * [[Architecture]] * [[Networking]] * [[Storage]]; [[quota]] system * [[Administrators]]' contact details == Using the cluster == * [[Jobs]] * [[Queues]] * [[SMP machines]] * [[Sandbox]] * [[Tesla]] * [[Modules]] * [[MPI]] * [[OpenMP]] * [[Parallelization|How to parallelize your job]] * [[Compilers]] * [[Software]] * [[Mail]] * [[Web server]] == Troubleshooting == * [[Why doesn't my job run?]] == Publications == * [[Acknowledgements]] * [[Cluster bibliography]] eb0e0b057a5aa50ff5cbc1ce1e41ac0fa3481cd6 308 302 2013-03-28T16:46:12Z Mjh 2 /* Cluster basics */ wikitext text/x-wiki == Welcome to the cluster documentation wiki == This wiki is the location for documentation for the STRI cluster. If you are a cluster user, feel free to register for an account so that you can describe any method you use to do a particular thing on the cluster. You must be approved for an account before editing the Wiki. Some local documentation on [[MediaWiki]] is available. == Known problems == * [[Known problems]] * [[Upgrade wishlist]] == Cluster basics == * [[Accounts]] and [[Account cancellation policy]] * [[Policies]] * [[Fair share]] * [[Access]] * [[Architecture]] * [[Networking]] * [[Storage]]; [[quota]] system * [[Administrators]]' contact details == Using the cluster == * [[Jobs]] * [[Queues]] * [[SMP machines]] * [[Sandbox]] * [[Tesla]] * [[Modules]] * [[MPI]] * [[OpenMP]] * [[Parallelization|How to parallelize your job]] * [[Compilers]] * [[Software]] * [[Mail]] * [[Web server]] == Troubleshooting == * [[Why doesn't my job run?]] == Publications == * [[Acknowledgements]] * [[Cluster bibliography]] b2c8591be0ae5b9f963a3c14ca94f8a2670b04a7 312 308 2013-04-12T13:51:18Z Mjh 2 /* Known problems */ wikitext text/x-wiki == Welcome to the cluster documentation wiki == This wiki is the location for documentation for the STRI cluster. If you are a cluster user, feel free to register for an account so that you can describe any method you use to do a particular thing on the cluster. You must be approved for an account before editing the Wiki. 
Some local documentation on [[MediaWiki]] is available. == Known problems == * [[Known problems]] == Cluster basics == * [[Accounts]] and [[Account cancellation policy]] * [[Policies]] * [[Fair share]] * [[Access]] * [[Architecture]] * [[Networking]] * [[Storage]]; [[quota]] system * [[Administrators]]' contact details == Using the cluster == * [[Jobs]] * [[Queues]] * [[SMP machines]] * [[Sandbox]] * [[Tesla]] * [[Modules]] * [[MPI]] * [[OpenMP]] * [[Parallelization|How to parallelize your job]] * [[Compilers]] * [[Software]] * [[Mail]] * [[Web server]] == Troubleshooting == * [[Why doesn't my job run?]] == Publications == * [[Acknowledgements]] * [[Cluster bibliography]] 5b7f02ea02b005a2ecda466a864fdecf00f88ab3 316 312 2013-04-22T14:33:10Z Mjh 2 /* Using the cluster */ wikitext text/x-wiki == Welcome to the cluster documentation wiki == This wiki is the location for documentation for the STRI cluster. If you are a cluster user, feel free to register for an account so that you can describe any method you use to do a particular thing on the cluster. You must be approved for an account before editing the Wiki. Some local documentation on [[MediaWiki]] is available. == Known problems == * [[Known problems]] == Cluster basics == * [[Accounts]] and [[Account cancellation policy]] * [[Policies]] * [[Fair share]] * [[Access]] * [[Architecture]] * [[Networking]] * [[Storage]]; [[quota]] system * [[Administrators]]' contact details == Using the cluster == * [[Jobs]] * [[Queues]] * [[Reservations]] * [[SMP machines]] * [[Sandbox]] * [[Tesla]] * [[Modules]] * [[MPI]] * [[OpenMP]] * [[Parallelization|How to parallelize your job]] * [[Compilers]] * [[Software]] * [[Mail]] * [[Web server]] == Troubleshooting == * [[Why doesn't my job run?]] == Publications == * [[Acknowledgements]] * [[Cluster bibliography]] 85dc8721dcb74915a08d48a6c0f5af51e05380b1 Web server 0 32 289 189 2012-12-21T20:13:43Z Mjh 2 wikitext text/x-wiki The web server <tt>http://stri-cluster.herts.ac.uk/</tt> is visible inside and outside the university. If you create a directory <tt>public_html</tt> in your home directory, then its contents are visible at <tt>http://stri-cluster.herts.ac.uk/~your-username/</tt>. You can use this to export data; for large datasets, use symbolic links to a data disc. Do not rely on the long-term existence of this facility (e.g. you should not use the cluster to host your personal home page). 74a170da27194419e2bf9f4794620e2fa550a199 User:Aidan Farrow 2 42 298 2013-02-23T11:08:36Z WikiSysop 1 Creating user page with biography of new user. wikitext text/x-wiki I am a research fellow in CAIR working with numerical models of global climate and atmospheric chemistry. 4d3845a664986fec56e7f679e609e8e99e6040e2 User talk:Aidan Farrow 3 43 299 2013-02-23T11:08:36Z WikiSysop 1 Welcome! wikitext text/x-wiki '''Welcome to ''Clusterwiki''!''' We hope you will contribute much and well. You will probably want to read the [[Help:Contents|help pages]]. Again, welcome and have fun! [[User:WikiSysop|WikiSysop]] 11:08, 23 February 2013 (GMT) 4b83af547e2c44a16866c2471cbe5d05f8d1f863 Parallelization 0 14 300 214 2013-02-23T12:00:49Z Mjh 2 wikitext text/x-wiki It is very important to realise that running a job on the cluster does not automatically give you access to all the CPU resources of the cluster (or even of the subset of the cluster you request from the [[jobs|job control system]]). There is one, and only one, situation in which you can run normal, single-threaded code on the cluster and get an improvement in performance. 
That is the situation in which ''you can usefully run multiple copies of the same program simultaneously without any communication between the different copies''; in other words, you are only limited by how many CPUs you can get access to. Examples might include Monte Carlo simulation or some data analysis tasks. Problems of this kind are called ''embarrassingly parallel''. Provided that your code is thread-safe &mdash; that is, it doesn't have implementation errors that prevent it being run several times simultaneously, such as use of temporary files with the same name &mdash; you can use the cluster for this sort of problem without modifying your code. You may be able to use the [[jobs|job control system]] with commands such as <tt>pbsdsh</tt>, or you may need to run an [[interactive jobs|interactive job]]. Once you need the different parts of your code to communicate, you in general ''must'' use code that is written specifically to do so. At present [[MPI]] is the method provided to do this. If you are planning to run an application written by a third party, it needs to use [[MPI]] if you expect to be able to run on more than one node. If you are running your own code, you will need to modify it, often very substantially, to allow parallel execution. '''This is your responsibility, not that of the cluster administrators.''' If you just intend to use all the processors on one node, and are not worried about explicit communication between threads, you can use [[OpenMP]], which will often be much less effort to integrate with existing code. 4a80896994505aad9c4ad76ea4f223cbdbe70834 OpenMP 0 44 301 2013-02-23T12:58:27Z Mjh 2 Created page with "OpenMP is an extension to commonly used programming languages that allows them to make use of the multiple processors ('cores') that are available on all modern PCs, including th..." wikitext text/x-wiki OpenMP is an extension to commonly used programming languages that allows them to make use of the multiple processors ('cores') that are available on all modern PCs, including the cluster nodes. In the best case, you can take existing code, add a few lines, and have parallelizable parts of your code, like loops, running on all available CPUs. Here's a simple example C program that runs multithreaded: <pre> int main(int argc, char *argv[]) { const int N = 100000; int i, a[N]; #pragma omp parallel for private(i) for (i = 0; i < N; i++) a[i] = 2 * i; return 0; } </pre> In this example, the loop is parallelized so that all available cores contribute to filling up the array. OpenMP tutorials are available online, e.g. [http://bisqwit.iki.fi/story/howto/openmp/]. The C and Fortran compilers available on the cluster support OpenMP. For example, gcc works with the flag <tt>-fopenmp</tt>: <pre> gcc -fopenmp code.c -o code </pre> If you do not compile with the correct flag set, the <tt>#pragma</tt> directives will be ignored! If you want to run OpenMP code from a job, see the relevant section of the [[Jobs]] page to make sure that you honour your allocation of CPUs. 738e8a6388012a9b0eed75086a2028e58583345e Mail 0 18 303 259 2013-02-23T13:00:00Z Mjh 2 wikitext text/x-wiki Various systems on the cluster will want to send you e-mail. By default this will be delivered to a local mailbox: you will need to read it with a command-line mail tool like <tt>mutt</tt> on the head node. You are advised to set up a <tt>.forward</tt> file in your home directory which will send mail to your normal inbox. 
To do this, simply create the file containing one line of text, the e-mail address to forward to. The commands below can be cut and pasted into a shell: <pre> cd cat <<END >.forward f.bloggs@herts.ac.uk END </pre> or you can edit the <tt>.forward</tt> file with your favourite editor. Please don't allow your inbox on the cluster to fill up with large messages. If you do not follow these instructions, the administrators reserve the right to delete your mailbox and/or to set up a <tt>.forward</tt> file for you themselves. e9e3cb5d4b653ed9924f3fc102826711f754f64c Acknowledgements 0 29 304 180 2013-02-25T15:48:42Z Mjh 2 wikitext text/x-wiki If possible, please say explicitly that you have used the STRI cluster in any paper you publish that makes use of results obtained using it. We don't insist on any explicit form of words, but an example might be 'This work has made use of the University of Hertfordshire Science and Technology Research Institute high-performance computing facility.' If you wish you can add a link to <tt>http://stri-cluster.herts.ac.uk/</tt>. Please also add details of any submitted, accepted or published paper using the cluster to the [[Cluster bibliography]] page. 239d2a376a91b0174e1ffcb03598ba61a990138c Administrators 0 6 305 24 2013-03-13T10:16:06Z Mjh 2 wikitext text/x-wiki == Administrators == These are currently: * John Atkinson, j.atkinson@herts.ac.uk (x3358, room E117C) * Martin Hardcastle, m.j.hardcastle@herts.ac.uk (x3409, room 2E71 (Innovation Centre)). Contact us with queries. Basic support queries (e.g. account requests, difficulty logging on or using software) should be directed to John in the first instance. a70b6af6fab85d47c1bb04ba21f4e126915e546a Accounts 0 3 306 106 2013-03-28T16:41:38Z Mjh 2 wikitext text/x-wiki To get an account, speak to John Atkinson in E117C. Accounts are available to the following classes of people: * Members of the Centre for Astrophysics Research (CAR) * Members of the Centre for Atmospheric & Instrumentation Research (CAIR) * Other research-active members of the School of Physics, Astronomy and Mathematics (PAM) * Members of the School of Computer Science (CS) * Others, by special arrangement; restricted to those who have made a financial contribution to the cluster. Access is granted subject to observance of our usage [[policies]]. Accounts may be suspended or cancelled if the cluster is abused: see also our [[Account cancellation policy]]. a2b5c1d7bc64cd9c8bcdab1a29b8aa709654bc88 Account cancellation policy 0 45 307 2013-03-28T16:45:49Z Mjh 2 Created page with "The policy for account closure and deletion is as follows: * People holding accounts on the cluster who are leaving UH or who no longer require their account should let the [[ad..." wikitext text/x-wiki The policy for account closure and deletion is as follows: * People holding accounts on the cluster who are leaving UH or who no longer require their account should let the [[administrators]] know that they are doing so; supervisors are also responsible for letting us know about departing students, postdocs and visitors. * When a cluster user leaves UH, their account will be locked (i.e. the account will still exist but it will no longer be possible to log in). A grace period of up to one month will be available, on request, to allow data to be moved off the cluster. * When an account is locked, we will send an e-mail to the user (if contactable) and their supervisor (if known) telling them that the account has been locked and referring them to this policy. 
* Three months after a user has left, all remaining data owned by that user will be completely deleted from the system. It is the responsibility of the owner of the data, or their supervisor, to make sure that all important data has been copied elsewhere before this happens. We will not keep backups of deleted data, nor will we chase people who do not appear to have taken the necessary action. * Supervisors or other colleagues may ask to take ownership of data belonging to students or postdocs who are leaving, but they thereby also take full responsibility for dealing with any impact it has on the system. * If a leaving student or member of staff is expected to be given visiting research fellow/lecturer/professor status, they may ask to be exempted from this policy. Other exemptions/extensions must be negotiated via senior management (e.g. the directors of CAR or CAIR, the Deans of PAM or CS, etc). dfde54e8c98130b35dc9032c1c9bf956d036371e 309 307 2013-03-28T16:46:40Z Mjh 2 wikitext text/x-wiki The policy for account closure and deletion is as follows: * People holding accounts on the cluster who are leaving UH or who no longer require their account should let the [[administrators]] know; supervisors are also responsible for letting us know about departing students, postdocs and visitors. * When a cluster user leaves UH, their account will be locked (i.e. the account will still exist but it will no longer be possible to log in). A grace period of up to one month will be available, on request, to allow data to be moved off the cluster. * When an account is locked, we will send an e-mail to the user (if contactable) and their supervisor (if known) telling them that the account has been locked and referring them to this policy. * Three months after a user has left, all remaining data owned by that user will be completely deleted from the system. It is the responsibility of the owner of the data, or their supervisor, to make sure that all important data has been copied elsewhere before this happens. We will not keep backups of deleted data, nor will we chase people who do not appear to have taken the necessary action. * Supervisors or other colleagues may ask to take ownership of data belonging to students or postdocs who are leaving, but they thereby also take full responsibility for dealing with any impact it has on the system. * If a leaving student or member of staff is expected to be given visiting research fellow/lecturer/professor status, they may ask to be exempted from this policy. Other exemptions/extensions must be negotiated via senior management (e.g. the directors of CAR or CAIR, the Deans of PAM or CS, etc). db8a8c994225b20da4ba6f76905f81c36c6acf28 MPI 0 12 310 220 2013-04-11T12:46:02Z Gr09aag 8 wikitext text/x-wiki == What is MPI? == MPI stands for ''Message Passing Interface''. It provides a standard for communication between processes either running on the same host or over a network. Programs written using this standard can communicate with each other without needing to know details of the way in which they are connected. For a more detailed description see the [http://en.wikipedia.org/wiki/Message_Passing_Interface wikipedia page]. The MPI standard (with extensions in some cases) has been implemented by many different groups. MPI tutorials are widely available on the web. == MPI on the cluster == There are several implementations of MPI on the cluster. All of them provide compilation tools (mpicc, mpif77), libraries, and a means of running code.
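As a quick orientation before the implementation-specific sections below, the basic workflow is the same in each case: check which wrapper compiler is on your PATH, compile with it, and then launch the program through the job control system rather than running it directly. This is only a sketch; <tt>hello.c</tt> is an illustrative filename:

<pre>
# Which MPI implementation's wrapper compiler is currently on your PATH?
which mpicc

# Compile an MPI program using the wrapper compiler
mpicc -O2 hello.c -o hello

# Run the result from a qsub script via /usr/local/bin/mpiexec
# (see the examples below), not directly on the head node
</pre>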
All use of MPI on the cluster must be integrated with the [[jobs|job control system]] and this page describes how to do that. You should read the page on [[jobs]] before trying to do anything with any of the example scripts given here! === MPICH2 === MPICH2 is the default implementation since it is provided as a package within Fedora. If you take no action, your MPI code will be compiled with MPICH2. Check which implementation you're using like this: <pre> > which mpicc /usr/lib64/mpich2/bin/mpicc </pre> MPICH2 is '''not''' Infiniband-aware. The communication between processes will use the Ethernet network, and so the latency and total available bandwidth of IPC will be about an order of magnitude worse than it would be with Infiniband-aware implementations (see below). To run MPICH2 jobs, your job control system script should call <tt>/usr/local/bin/mpiexec</tt>: <pre> #!/bin/sh -f #PBS -N mpich2-demo #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ /usr/local/bin/mpiexec /home/myusername/mympijob echo ------------------------------------------------------ echo Job ends </pre> <tt>mpiexec</tt> uses the Torque environment variables to start processes up on the appropriate nodes. Note that in this example the '''only''' key line is the one calling <tt>mpiexec</tt>. The others are Torque directives or are there for debugging/demonstration purposes. For documentation of <tt>mpiexec</tt>, see [http://www.osc.edu/~djohnson/mpiexec/index.php]. By default, <tt>mpiexec</tt> will run as many processes as there are processors available to it: so if you have nodes=16:ppn=8 set, as in the example above, there will be 16*8=128 MPI processes (8 per node). If for some reason you want to alter this default behaviour (e.g. if your MPI code needs to run multi-threaded, as would be the case if you were mixing MPI and OpenMP), then <tt>mpiexec</tt> has options to allow this: run <tt>/usr/local/bin/mpiexec</tt> on the head node with no arguments to get a list of possible options. === MPICH2 (local versions) === Locally compiled, more up-to-date versions of MPICH2 are available. To use these use <tt>modules</tt> commands: do <pre> module unload mpich2-x86_64 module load mpich2-local OR module load mpich2-intel </pre> Then <pre> which mpicc /soft/mpich2/bin/mpicc </pre> If you wish to use these permanently, then you are recommended to put these module commands in your .cshrc or .bashrc. Jobs compiled this way should also be run with /usr/local/bin/mpiexec. === MVAPICH2 === MVAPICH2 is one of the three MPI implementations installed as part of the [http://www.openfabrics.org/ OFED] Infiniband software distribution. It uses native Infiniband system calls to communicate. This means that the latency of IPC can be as low as a few microseconds and the full bandwidth of the Infiniband interfaces is available. 
This does not work on the smp machines (the hardware is incompatible); use e.g. MPICH2 there instead, or use the main queue. To use MVAPICH2 do <pre> module unload mpich2-x86_64 module load mvapich2 </pre> Then you should see <pre> > which mpicc /usr/mpi/gcc/mvapich2-1.4.1/bin/mpicc </pre> <tt>mpiexec</tt> also works for starting MVAPICH2 jobs: <pre> #!/bin/sh -f #PBS -N mvapich2-demo #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ /usr/local/bin/mpiexec /home/myusername/mympijob echo ------------------------------------------------------ echo Job ends </pre> === MVAPICH === This is an earlier Infiniband-aware implementation of MPI. Not recommended. === OpenMPI === This is the third implementation provided by the OFED packages; like the others, it can be selected via the [[modules]] system. In principle it should be as good as MVAPICH2 from the point of view of Infiniband use. If you want to use it you will need to integrate it with Torque yourself -- please feel free to do so and to document it here. 7b8699a565fd972e5ff9c2a57765ccb749985f6d Networking 0 10 313 245 2013-04-15T11:43:12Z Mjh 2 wikitext text/x-wiki The nodes are linked by Gigabit ethernet and Infiniband networks. The ethernet network is a fairly conventional one; each chassis (see [[architecture]]) has an internal ethernet switch to which all nodes in that chassis are connected. These switches, together with the head node, are connected together via a main ethernet switch. The links connecting the head node and the SMP machines, and the uplinks from the two node internal ethernet stacks, are carried by multiple physical cables via link aggregation. A separate physical ethernet network carries management traffic. The infiniband network is slightly more complex. Each chassis has an internal infiniband switch and these are all linked via two main infiniband switches. This arrangement is intended to provide redundancy and higher bandwidth between nodes in different chassis. chassis1-3 use DDR infiniband; all other machines on the network have QDR infiniband cards. The networking arrangements should be taken into consideration when running jobs where fast communication is important -- latency is somewhat lower and data transfer rates somewhat higher between nodes in the same chassis than between nodes in different chassis in the same cluster, and ethernet connections between the two sub-clusters are higher-latency and lower-bandwidth still. Best results will be obtained running jobs within a single chassis. None of the nodes are on the public Internet: they all have private IP addresses starting 192.168.... These addresses are not routable from anywhere else in the University. They can, however, all see the public Internet via IP masquerading through the head node over the ethernet network. (For this reason, it is not sensible to try to make high-volume accesses to the Internet from the nodes.)
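If a job needs a large external dataset, it is therefore better to fetch it onto shared [[storage]] from the head node before submitting the job, rather than downloading it from inside the job itself. A sketch only; the URL and working directory are illustrative:

<pre>
# On the head node, before submitting the job:
cd /home/fred/my_working_directory
wget http://example.org/large-dataset.tar.gz
tar xzf large-dataset.tar.gz
</pre>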
Each node has an IP address corresponding to its ethernet port (192.168.2.xxx where xxx is the node number) and one corresponding to the Infiniband port (192.168.3.xxx). Within the cluster, you may use nodexxx.data (e.g. node001.data) and nodexxx.infi (node001.infi) to refer to these two networks. High-volume traffic should use the Infiniband network. The two SMP machines have addresses smp1.data, smp1.infi etc. c5c5f682db0effed1c93288ae2f7197f78665f33 Reservations 0 46 314 2013-04-22T14:32:08Z Mjh 2 Created page with "It is possible to request reservations of one or more nodes at fixed times in the future. If you have reserved a node, no other jobs than yours will be able to run on it. You ca..." wikitext text/x-wiki It is possible to request reservations of one or more nodes at fixed times in the future. If you have reserved a node, no other jobs than yours will be able to run on it. You can request a reservation from the [[administrators]] or, if you have a particular need to do this often, we can set things up so that you can create your own reservations automatically. Currently members of the CTCA group are able to reserve the smp machines themselves by using the command <tt>sudo -u maui /usr/local/bin/reserve-ctca.sh [machine name] [start time] [end time]</tt>, where <tt>[start time]</tt> and <tt>[end time]</tt> have the format [HH[:MM[:SS]]][_MO[/DD[/YY]]], e.g. 14:30_06/20 means 14:30 on the 20th of June this year. 2ea9895e49c1cf343f63fa19e6a60ffdd7396cdf 315 314 2013-04-22T14:32:43Z Mjh 2 wikitext text/x-wiki It is possible to request reservations of one or more nodes at fixed times in the future. If you have reserved a node, no other jobs than yours will be able to run on it. You can request a reservation from the [[administrators]] or, if you have a particular need to do this often, we can set things up so that you can create your own reservations automatically. Currently members of the CTCA group are able to reserve the smp machines themselves by using the command <pre> sudo -u maui /usr/local/bin/reserve-ctca.sh [machine name] [start time] [end time]</pre> where <tt>[start time]</tt> and <tt>[end time]</tt> have the format [HH[:MM[:SS]]][_MO[/DD[/YY]]], e.g. 14:30_06/20 means 14:30 on the 20th of June this year. fbbaafdd29b846cc1fdd226c9157069785993f85 317 315 2013-04-22T14:34:18Z Mjh 2 wikitext text/x-wiki It is possible to request reservations of one or more nodes at fixed times in the future. If you have reserved a node, no other jobs than yours will be able to run on it. You can request a reservation from the [[administrators]] or, if you have a particular need to do this often, we can set things up so that you can create your own reservations automatically. Please only ask for a reservation if you believe that there is no method of doing what you want within the standard [[Jobs|queueing system]]. Currently members of the CTCA group are able to reserve the smp machines themselves by using the command <pre> sudo -u maui /usr/local/bin/reserve-ctca.sh [machine name] [start time] [end time]</pre> where <tt>[start time]</tt> and <tt>[end time]</tt> have the format [HH[:MM[:SS]]][_MO[/DD[/YY]]], e.g. 14:30_06/20 means 14:30 on the 20th of June this year. 8d095ba70468c522b7b804ec04e899a15aff0b30 318 317 2013-04-22T16:08:42Z Mjh 2 wikitext text/x-wiki It is possible to request reservations of one or more nodes at fixed times in the future. If you have reserved a node, no other jobs than yours will be able to run on it. 
You can request a reservation from the [[administrators]] or, if you have a particular need to do this often, we can set things up so that you can create your own reservations automatically. Please only ask for a reservation if you believe that there is no method of doing what you want within the standard [[Jobs|queueing system]]. Currently members of the CTCA group are able to reserve the smp machines themselves by using the command <pre> sudo -u maui /usr/local/bin/reserve-ctca.sh [machine name] [start time] [end time]</pre> where <tt>[start time]</tt> and <tt>[end time]</tt> have the format [HH[:MM[:SS]]][_MO[/DD[/YY]]], e.g. 14:30_06/20 means 14:30 on the 20th of June this year. General guidelines for user-created reservations are as follows: * Reserve the machine for a period when no existing jobs will be running -- use <tt>qstat</tt> to determine when this will be. * Use the reservation by submitting a job as normal. A reservation available to you will be used automatically if it is available. If you need an interactive session, use an [[interactive jobs|interactive job]]; e.g. if you have reserved smp2 for two days, do <tt>qsub -I -q smp -l nodes=smp2:ppn=48 -l walltime=48:00:00</tt>. * If you no longer need a reservation, e-mail the administrators to ask them to delete it. ad2f7aa766ddfc4f8b6ec0509b53da0362b568d9 Memory 0 36 319 225 2013-05-01T11:25:30Z Mjh 2 wikitext text/x-wiki Jobs that will use a large amount of physical memory must make sure that they request this in the job submission command or script. As described in the section on [[architecture]], the nodes of the main cluster have 24 Gb of physical memory. If the total amount of memory used by all jobs running on the nodes is larger than, or even starts to approach, this value, then performance will drop, jobs may be unable to allocate memory and will eventually be killed, and the nodes may even crash (as the Linux out-of-memory killer appears often to leave a node in an unstable state). To make sure that this doesn't happen to your job (or, worse, your job causes it to happen to someone else's) you should specify the amount of physical memory used per process, if it is more than 1 Gb, the default, by using the <tt>pmem</tt> attribute in the job control system. So, for example, if you need 8 Gb of memory per process for 8 processes, an example job submission script would look like this: <pre> #!/bin/sh -f #PBS -N large-job #PBS -m abe #PBS -l nodes=8 #PBS -l walltime=00:01:00 #PBS -l pmem=8gb #PBS -k oe ... job commands go here ... </pre> This job will attempt to run 8 separate processes, but the scheduling system will not (as it would by default) place them all on the same node, since it knows that the processes would require more memory than is available. They will be placed on separate nodes where their memory requirements will not interfere with each other. It's the user's responsibility to figure out a sensible memory request for large jobs. Err on the side of generosity, but bear in mind that asking for too much memory will make the scheduling of both your and other users' jobs inefficient. Note that the typical cluster job runs very comfortably in 1 Gb. You can see how much physical memory a running job is using by doing <tt>qstat -f <jobid></tt>: the line <tt>resources_used.mem</tt> tells you the total memory use for all processes. Users needing very large amounts of memory should consider using the [[SMP machines]], which have 256 Gb of physical memory.
(If doing this, it is still sensible to use the <tt>pmem</tt> job attribute to make sure that you will not interfere with other users.) f7e7cd4642ede90e2f7c1b9cbc94c0806f028cc2 Software 0 17 321 320 2013-05-13T10:24:32Z Mjh 2 wikitext text/x-wiki This page documents locations of software. Detailed local documentation of software should go in a page specific to that software. * <u>[[Gromacs]]</u>: 4.0.7 installed in <tt>/soft/gromacs</tt> * Autodock <u>[[Vina]]</u>: 1.1.1 installed in <tt>/soft/autodock_vina_1_1_1_linux_x86</tt> * <u>[[Autodock]]</u>: 4.2 installed in <tt>/soft/autodock</tt> * <u>[[iGemDock]]</u>: 2.1 installed in <tt>/soft/iGEMDOCKv2.1-centos/</tt> * <u>[[AIPS]]</u>: 31DEC10 installed in <tt>/soft/aips</tt> * <u>[[CASA]]</u>: 3.1.0 installed in <tt>/soft/casapy-31.0.13530-002-64b</tt> * <u>[[IDL]]</u>: in <tt>/soft/idl/idl/bin</tt> * <u>[[Matlab]]</u>: in <tt>/soft/matlab/R2010a/bin</tt> * [[Starlink]]: in <tt>/soft/star</tt> or <tt>/soft/stardev</tt> 53942b3fab7435325534c9c0b492b606116b7464 322 321 2013-05-13T10:24:46Z Mjh 2 wikitext text/x-wiki This page documents locations of software. Detailed local documentation of software should go in a page specific to that software. * <u>[[Gromacs]]</u>: 4.0.7 installed in <tt>/soft/gromacs</tt> * Autodock <u>[[Vina]]</u>: 1.1.1 installed in <tt>/soft/autodock_vina_1_1_1_linux_x86</tt> * <u>[[Autodock]]</u>: 4.2 installed in <tt>/soft/autodock</tt> * <u>[[iGemDock]]</u>: 2.1 installed in <tt>/soft/iGEMDOCKv2.1-centos/</tt> * <u>[[AIPS]]</u>: 31DEC10 installed in <tt>/soft/aips</tt> * <u>[[CASA]]</u>: 3.1.0 installed in <tt>/soft/casapy-31.0.13530-002-64b</tt> * <u>[[IDL]]</u>: in <tt>/soft/idl/idl/bin</tt> * <u>[[Matlab]]</u>: in <tt>/soft/matlab/R2010a/bin</tt> * <u>[[Starlink]]</u>: in <tt>/soft/star</tt> or <tt>/soft/stardev</tt> d29cd42d1f44a401b2af01dbcf6f49ca958deddc 328 322 2013-06-14T13:39:07Z Akukol 3 wikitext text/x-wiki This page documents locations of software. Detailed local documentation of software should go in a page specific to that software. * <u>[[Gromacs]]</u>: 4.5.5 installed in <tt>/soft/gromacs-new</tt> * Autodock <u>[[Vina]]</u>: 1.1.1 installed in <tt>/soft/autodock_vina_1_1_1_linux_x86</tt> * <u>[[Autodock]]</u>: 4.2 installed in <tt>/soft/autodock</tt> * <u>[[iGemDock]]</u>: 2.1 installed in <tt>/soft/iGEMDOCKv2.1-centos/</tt> * <u>[[AIPS]]</u>: 31DEC10 installed in <tt>/soft/aips</tt> * <u>[[CASA]]</u>: 3.1.0 installed in <tt>/soft/casapy-31.0.13530-002-64b</tt> * <u>[[IDL]]</u>: in <tt>/soft/idl/idl/bin</tt> * <u>[[Matlab]]</u>: in <tt>/soft/matlab/R2010a/bin</tt> * <u>[[Starlink]]</u>: in <tt>/soft/star</tt> or <tt>/soft/stardev</tt> 73e86f0b08afc30532943c383a45daa5fcafc35d 333 328 2013-06-25T13:24:34Z Mjh 2 wikitext text/x-wiki This page documents locations of software. Detailed local documentation of software should go in a page specific to that software. 
* <u>[[Gromacs]]</u>: 4.5.5 installed in <tt>/soft/gromacs-new</tt> * Autodock <u>[[Vina]]</u>: 1.1.1 installed in <tt>/soft/autodock_vina_1_1_1_linux_x86</tt> * <u>[[Autodock]]</u>: 4.2 installed in <tt>/soft/autodock</tt> * <u>[[iGemDock]]</u>: 2.1 installed in <tt>/soft/iGEMDOCKv2.1-centos/</tt> * <u>[[AIPS]]</u>: 31DEC10 installed in <tt>/soft/aips</tt> * <u>[[CASA]]</u>: installed in <tt>/soft/casapy...</tt> * <u>[[IDL]]</u>: in <tt>/soft/idl/idl/bin</tt> * <u>[[Matlab]]</u>: in <tt>/soft/matlab/R2010a/bin</tt> * <u>[[Starlink]]</u>: in <tt>/soft/star</tt> or <tt>/soft/stardev</tt> * <u>[[LOFAR]]</u>: in <tt>/soft/lofar</tt> b0e4f7225a7ee1a2de273da445156ccc5294c39b 346 333 2013-08-16T09:18:58Z Mjh 2 wikitext text/x-wiki This page documents locations of software. Detailed local documentation of software should go in a page specific to that software. * <u>[[Gromacs]]</u>: 4.5.5 installed in <tt>/soft/gromacs-new</tt> * Autodock <u>[[Vina]]</u>: 1.1.1 installed in <tt>/soft/autodock_vina_1_1_1_linux_x86</tt> * <u>[[Autodock]]</u>: 4.2 installed in <tt>/soft/autodock</tt> * <u>[[iGemDock]]</u>: 2.1 installed in <tt>/soft/iGEMDOCKv2.1-centos/</tt> * <u>[[AIPS]]</u>: 31DEC10 installed in <tt>/soft/aips</tt> * <u>[[CASA]]</u>: installed in <tt>/soft/casapy...</tt> * <u>[[IDL]]</u>: in <tt>/soft/idl/idl/bin</tt> * <u>[[Matlab]]</u>: in <tt>/soft/matlab/R2010a/bin</tt> * <u>[[Starlink]]</u>: in <tt>/soft/star</tt> or <tt>/soft/stardev</tt> * <u>[[LOFAR]]</u>: in <tt>/soft/lofar</tt> * <u>[[Python packages]]</u>: set your PYTHONPATH to include <tt>/soft/python/lib64/python2.7/site-packages</tt> de96ae074a0dbbf6d8c283eb39386d58307cb00a 364 346 2013-11-26T17:24:21Z Dbab 11 wikitext text/x-wiki This page documents locations of software. Detailed local documentation of software should go in a page specific to that software. * <u>[[Gromacs]]</u>: 4.5.5 installed in <tt>/soft/gromacs-new</tt> * Autodock <u>[[Vina]]</u>: 1.1.1 installed in <tt>/soft/autodock_vina_1_1_1_linux_x86</tt> * <u>[[Autodock]]</u>: 4.2 installed in <tt>/soft/autodock</tt> * <u>[[iGemDock]]</u>: 2.1 installed in <tt>/soft/iGEMDOCKv2.1-centos/</tt> * <u>[[AIPS]]</u>: 31DEC10 installed in <tt>/soft/aips</tt> * <u>[[CASA]]</u>: installed in <tt>/soft/casapy...</tt> * <u>[[IDL]]</u>: in <tt>/soft/idl/idl/bin</tt> * <u>[[Matlab]]</u>: in <tt>/soft/matlab/R2010a/bin</tt> * <u>[[Starlink]]</u>: in <tt>/soft/star</tt> or <tt>/soft/stardev</tt> * <u>[[LOFAR]]</u>: in <tt>/soft/lofar</tt> * <u>[[Python packages]]</u>: set your PYTHONPATH to include <tt>/soft/python/lib64/python2.7/site-packages</tt> * <u>[[neuron]]</u>: in <tt> /soft/nrn</tt> 271ba82bf221032fcf3feffd172213771628ae57 Jobs 0 9 323 311 2013-05-13T11:26:14Z Mjh 2 /* Jobs that depend on other jobs */ wikitext text/x-wiki The main mechanism for carrying out computing on the cluster is the '''batch queue system'''. This is based around [http://www.clusterresources.com/torquedocs21/ Torque] (formerly known as PBS) and [http://www.clusterresources.com/products/maui/docs/mauiadmin.shtml Maui], which are free software and widely used in the HPC world, particularly in academia. Torque is the queueing system, which handles job submission, and Maui is the scheduler, which decides what jobs are run. Rather than running processes directly on the compute nodes, you submit a script which, in general, contains both directives to the queuing system and the commands needed to run your jobs. 
The scheduler will then decide, based on the available resources, whether your job can be run immediately or must be deferred till later. If other people's jobs are running, you are likely to have to wait. It is important therefore to request only the minimum resources you need: a job that only needs 8 nodes may be able to run immediately while a job that requests all 48 nodes in the main cluster may need to wait for days. Once your job starts running the standard output and standard error can be saved or discarded, at your option. Most jobs will operate on files in one of the cluster-wide file systems (see [[storage]]). You can also arrange to have the system send you e-mail when your job starts and/or ends. You should enable [[passwordless ssh]] between nodes before starting to use the job submission system. Various parts of it and code that depends on it rely on being able to use ssh or scp. == Basic commands == The command to submit a job to the batch queue system is <tt>qsub</tt>. <tt>qsub</tt> takes one obligatory argument, the name of the script that will be run to start the job: <pre> qsub myjob.sh </pre> This must be a script: specifying a binary file will not work. Optionally, qsub can take a number of arguments that specify what resources the job requires and how it is to be handled by the queuing system. Alternatively, these can be specified in the script, tagged with lines beginning <tt>#PBS</tt> before any executable lines. For example, <pre> cat << END > myjob1.sh #!/bin/sh echo "Hello world" END qsub -N hello -m abe myjob1.sh </pre> and <pre> cat <<END > myjob2.sh #!/bin/sh #PBS -N hello #PBS -m abe echo "Hello world" END qsub myjob2.sh </pre> are functionally equivalent. It is often more useful to have the options specified in the script. You can also have a mixture: if the same option is specified on the <tt>qsub</tt> command line and in a script directive, the command line takes precedence. A full list of qsub options is given by <tt>man qsub</tt> or is available [http://www.clusterresources.com/torquedocs21/commands/qsub.shtml online]. Some important ones are as follows: * -N: specify the name of the job as it will appear in the queue * -m: say when the system should e-mail you: <tt>-m abe</tt> means to e-mail on abort, begin and end. Before setting this flag, read the page on [[Mail]] and take appropriate action! * -k: specify whether the standard output or standard error streams of the job should be mirrored in your home directory. If you specify this, you will be able to see the output as it's generated, but it will appear in your home directory. If you don't, the output will be stored in the directory specified by -o and -e or in the current working directory of qsub if they are not specified, but it will only appear once the job is finished. * -j: specify whether the output and error streams should be kept separate or merged. * -o, -e: specify where standard output/error should be stored, if not in the current directory * -q: specify what [[queues|queue]] the job will be run on. * -t: start multiple jobs simultaneously (see below) * -W: for job inter-dependencies (see below) * -v: to set environment variables (see below) The most important option is <tt>-l</tt>, which specifies the resources to be used by the job. On our system the useful resources to specify are * <tt>walltime</tt>: the expected execution time in h:m:s format (if unspecified, the default walltime on all queues is 24 hours). 
* <tt>nodes</tt>: a list of requirements on the nodes to be allocated, or possibly more than one joined by a plus sign. * <tt>pmem</tt>: the physical [[memory]] requirements for your job. For example, <pre> qsub -l walltime=2:0:0 -l nodes=40 job.sh ## run for two hours on 40 CPUs (which may be spread over up to 40 physical nodes) qsub -l walltime=1:0:0 -l nodes=16:ppn=8 job.sh ## run for one hour on 16 nodes with 8 CPUs per node qsub -l walltime=0:1:0 -l nodes=2:chassis1:ppn=4 job2.sh ## run for one minute on 2 nodes from chassis1 with 4 CPUs per node qsub -l walltime=0:5:0 -l nodes=1:chassis1:ppn=1+1:chassis2:ppn=1 job3.sh ## run for 5 minutes on one node from chassis1 and one from chassis2 qsub -l walltime=0:0:10 -l nodes=node001:ppn=8 job4.sh ## run for 10 seconds on all CPUs of node001 qsub -l walltime=0:1:0 -l nodes=1:2 -l pmem=8gb job5.sh ## Run two processes for one minute, requesting 8 Gb of memory each </pre> (In general you will not want to specify node or [[architecture|chassis]] names directly, as this will interfere with the efficiency of the scheduling. The exception might be if you want to guarantee that all inter-process communications take place within a chassis; see [[networking]]). Note also that in the present configuration requests for fewer processors per node than the 8 that are available may be consolidated onto fewer nodes. So the third request above, run on an empty cluster, would actually run on one physical node with 8 CPUs. This should never be a problem unless you are specifically testing inter-node communication. For efficiency, you are recommended always to ask for and use 8 processes per node for MPI or multi-threaded code. If you are running single-threaded code, you should ask for one processor per node. The defaults for all the other options are sensible but you ''must'' think carefully about the resources you need. Requesting more time or resources than you need will lead to inefficiency in scheduling your jobs: requesting less than you need (particularly in terms of walltime) is likely to mean that your job does not run to completion. A job that exceeds the walltime estimate will be terminated. 
''A walltime should be specified: the default walltime (see [[Queues]]) is probably not what you want.'' Once a job is submitted, you can view its progress with <tt>qstat</tt>: <pre> >qstat Job id Name User Time Use S Queue ------------------- ---------------- --------------- -------- - ----- 1765.stri-cluster HCG62-doublebeta mjh 00:00:00 R all 1766.stri-cluster ...59-doublebeta mjh 00:00:00 R all 1767.stri-cluster ...07-doublebeta mjh 00:00:00 R all 1768.stri-cluster ...83-doublebeta mjh 00:00:00 R all 1769.stri-cluster ...73-doublebeta mjh 00:00:00 R all 1770.stri-cluster ...61-doublebeta mjh 0 Q all 1771.stri-cluster ...36-doublebeta mjh 0 Q all 1772.stri-cluster ...44-doublebeta mjh 0 Q all 1773.stri-cluster ...29-doublebeta mjh 0 Q all 1774.stri-cluster ...33-doublebeta mjh 0 Q all 1775.stri-cluster ...46-doublebeta mjh 0 Q all >qstat -a stri-cluster.herts.ac.uk: Req'd Req'd Elap Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time -------------------- -------- -------- ---------- ------ ----- --- ------ ----- - ----- 1765.stri-cluster.he mjh all HCG62-doub 11125 16 -- -- 05:00 R 01:56 1766.stri-cluster.he mjh all IC1459-dou 10658 16 -- -- 05:00 R 01:56 1767.stri-cluster.he mjh all NGC3607-do 7355 16 -- -- 05:00 R 01:56 1768.stri-cluster.he mjh all NGC383-dou 12069 16 -- -- 05:00 R 01:56 1769.stri-cluster.he mjh all NGC4073-do 12793 16 -- -- 05:00 R 01:56 1770.stri-cluster.he mjh all NGC4261-do -- 16 -- -- 05:00 Q -- 1771.stri-cluster.he mjh all NGC4636-do -- 16 -- -- 05:00 Q -- 1772.stri-cluster.he mjh all NGC5044-do -- 16 -- -- 05:00 Q -- 1773.stri-cluster.he mjh all NGC5129-do -- 16 -- -- 05:00 Q -- 1774.stri-cluster.he mjh all NGC533-dou -- 16 -- -- 05:00 Q -- 1775.stri-cluster.he mjh all NGC5846-do -- 16 -- -- 05:00 Q -- </pre> The different options to qstat give different views of the queue. Here we see that some jobs are running (R) and some are queued (Q). You may also see jobs that have recently completed (C), jobs that are held pending some condition (H) and, hopefully rarely, jobs where there is some error in the scheduling system (E). The MAUI tool <tt>showq</tt> can also be used to get a view of the whole queue, which is sometimes more user-friendly: <pre> /usr/local/maui/bin/showq ACTIVE JOBS-------------------- JOBNAME USERNAME STATE PROC REMAINING STARTTIME 1765 mjh Running 128 1:15:20 Fri May 7 13:38:20 1766 mjh Running 128 1:15:20 Fri May 7 13:38:20 1767 mjh Running 128 1:15:20 Fri May 7 13:38:20 1768 mjh Running 128 1:15:20 Fri May 7 13:38:20 1769 mjh Running 128 1:15:20 Fri May 7 13:38:20 5 Active Jobs 640 of 640 Processors Active (100.00%) 80 of 80 Nodes Active (100.00%) IDLE JOBS---------------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME 1770 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1771 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1772 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1773 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1774 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1775 mjh Idle 128 5:00:00 Fri May 7 13:38:20 6 Idle Jobs BLOCKED JOBS---------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME Total Jobs: 11 Active Jobs: 5 Idle Jobs: 6 Blocked Jobs: 0 </pre> Finally, you can delete jobs from the queue with the Torque command <tt>qdel</tt>, you can change your resource request with <tt>qualter</tt>, and you can move queued jobs between different queues with <tt>qmove</tt>. == Running code == Your job will start running on ''one'' of the nodes it has been allocated. 
It is your script's responsibility to start your code running on any other nodes you have been allocated. At run time, environment variables contain information about the job. A full list is [http://www.clusterresources.com/torquedocs21/2.1jobsubmission.shtml#submitvars here]. The crucial one is $PBS_NODEFILE, which provides a list of nodes on which the job should run. ''You must honour this node list.'' The [http://www.clusterresources.com/torquedocs21/commands/pbsdsh.shtml pbsdsh] command can be used in your script to run a single program on all the allocated nodes, as in the following example: <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ pbsdsh hostname echo ------------------------------------------------------ echo Job ends </pre> pbsdsh would be suitable for starting a number of parallel processes which do not need to communicate, e.g. multiple identical Monte Carlo simulations. For scripts to start [[MPI]] processes in such a way that inter-process communication (IPC) works see the description of [[MPI]]. If you are running [[OpenMP]] code, which should only be running on a single node, then you need to make sure that the code does not grab all available CPUs. An easy way to make sure that you take only the CPUs that you have been allocated would be to put <pre> setenv OMP_NUM_THREADS `cat $PBS_NODEFILE | wc -l` </pre> in the qsub script before the code runs. == Environment == In all cases your script must set the environment that you want your program to run in. Environment variables, aliases and working directory are not inherited from the shell that runs qsub. So you will often want to change to an appropriate working directory and to source some startup files in your qsub script before you actually execute any code. 
<pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ cd /home/fred/my_working_directory export PATH=/home/fred/my_binaries:${PATH} /usr/local/bin/mpiexec my-mpi-code arg1 arg2 echo ------------------------------------------------------ echo Job ends </pre> Note that you can set environment variables in the environment of the running script with the <tt>-v</tt> option to <tt>qsub</tt>: <pre> qsub -v NAME=fred myjob.sh </pre> The value of <tt>$NAME</tt> is then available to the script. This is useful if you want to write a generic script and control it using a parameter, rather than editing the script. == Multiple job submission == The <tt>-t</tt> option to qsub allows you to create multiple jobs simultaneously. These will all run the same script but will differ in the value of the environment variable <tt>$PBS_ARRAYID</tt>. Your script should normally use this variable to decide how to behave differently. For example, multiple runs of a Monte Carlo simulation might use it to store the output in different files. The jobs will all be submitted to the queue immediately and may run concurrently, depending on the resources they request, so it is important to make sure that it is safe to do this. A common mistake is to use the -t option wrongly. <pre> qsub -t 4 myjob.qsub </pre> will only start ''one'' job, with $PBS_ARRAYID set to 4. <pre> qsub -t 1-4 myjob.qsub </pre> will queue four jobs, with $PBS_ARRAYID ranging from 1 to 4. == Interactive jobs == If you need to access nodes interactively, see the separate page on [[interactive jobs]]. == Jobs that depend on other jobs == You may wish to submit jobs that depend on each other: for example, to submit a job that only runs after another job has finished. (This allows you to stack up very long computing requests without violating the maximum walltime constraints.) Look at <tt>man qsub</tt> for the full details, but for a simple dependence use <pre> qsub -W depend=afterany:123456 myjob.qsub </pre> which will cause your job to be run only after job 123456 has finished (no matter what its output status). 6f17c8a4134aca53508cb960eaeb766926f59ac9 340 323 2013-07-28T16:23:09Z Mjh 2 /* Basic commands */ wikitext text/x-wiki The main mechanism for carrying out computing on the cluster is the '''batch queue system'''. This is based around [http://www.clusterresources.com/torquedocs21/ Torque] (formerly known as PBS) and [http://www.clusterresources.com/products/maui/docs/mauiadmin.shtml Maui], which are free software and widely used in the HPC world, particularly in academia. Torque is the queueing system, which handles job submission, and Maui is the scheduler, which decides what jobs are run. 
Rather than running processes directly on the compute nodes, you submit a script which, in general, contains both directives to the queuing system and the commands needed to run your jobs. The scheduler will then decide, based on the available resources, whether your job can be run immediately or must be deferred till later. If other people's jobs are running, you are likely to have to wait. It is important therefore to request only the minimum resources you need: a job that only needs 8 nodes may be able to run immediately while a job that requests all 48 nodes in the main cluster may need to wait for days. Once your job starts running the standard output and standard error can be saved or discarded, at your option. Most jobs will operate on files in one of the cluster-wide file systems (see [[storage]]). You can also arrange to have the system send you e-mail when your job starts and/or ends. You should enable [[passwordless ssh]] between nodes before starting to use the job submission system. Various parts of it and code that depends on it rely on being able to use ssh or scp. == Basic commands == The command to submit a job to the batch queue system is <tt>qsub</tt>. <tt>qsub</tt> takes one obligatory argument, the name of the script that will be run to start the job: <pre> qsub myjob.sh </pre> This must be a script: specifying a binary file will not work. Optionally, qsub can take a number of arguments that specify what resources the job requires and how it is to be handled by the queuing system. Alternatively, these can be specified in the script, tagged with lines beginning <tt>#PBS</tt> before any executable lines. For example, <pre> cat << END > myjob1.sh #!/bin/sh echo "Hello world" END qsub -N hello -m abe myjob1.sh </pre> and <pre> cat <<END > myjob2.sh #!/bin/sh #PBS -N hello #PBS -m abe echo "Hello world" END qsub myjob2.sh </pre> are functionally equivalent. It is often more useful to have the options specified in the script. You can also have a mixture: if the same option is specified on the <tt>qsub</tt> command line and in a script directive, the command line takes precedence. A full list of qsub options is given by <tt>man qsub</tt> or is available [http://www.clusterresources.com/torquedocs21/commands/qsub.shtml online]. Some important ones are as follows: * -N: specify the name of the job as it will appear in the queue * -m: say when the system should e-mail you: <tt>-m abe</tt> means to e-mail on abort, begin and end. Before setting this flag, read the page on [[Mail]] and take appropriate action! * -k: specify whether the standard output or standard error streams of the job should be mirrored in your home directory. If you specify this, you will be able to see the output as it's generated, but it will appear in your home directory. If you don't, the output will be stored in the directory specified by -o and -e or in the current working directory of qsub if they are not specified, but it will only appear once the job is finished. * -j: specify whether the output and error streams should be kept separate or merged. * -o, -e: specify where standard output/error should be stored, if not in the current directory * -q: specify what [[queues|queue]] the job will be run on. * -t: start multiple jobs simultaneously (see below) * -W: for job inter-dependencies (see below) * -v: to set environment variables (see below) The most important option is <tt>-l</tt>, which specifies the resources to be used by the job. 
On our system the useful resources to specify are * <tt>walltime</tt>: the expected execution time in h:m:s format (if unspecified, the default walltime on all queues is 24 hours). * <tt>nodes</tt>: a list of requirements on the nodes to be allocated, or possibly more than one joined by a plus sign. * <tt>pmem</tt>: the physical [[memory]] requirements for your job. * <tt>file</tt>: the [[local disk space]] to be used by your job. For example, <pre> qsub -l walltime=2:0:0 -l nodes=40 job.sh ## run for two hours on 40 CPUs (which may be spread over up to 40 physical nodes) qsub -l walltime=1:0:0 -l nodes=16:ppn=8 job.sh ## run for one hour on 16 nodes with 8 CPUs per node qsub -l walltime=0:1:0 -l nodes=2:chassis1:ppn=4 job2.sh ## run for one minute on 2 nodes from chassis1 with 4 CPUs per node qsub -l walltime=0:5:0 -l nodes=1:chassis1:ppn=1+1:chassis2:ppn=1 job3.sh ## run for 5 minutes on one node from chassis1 and one from chassis2 qsub -l walltime=0:0:10 -l nodes=node001:ppn=8 job4.sh ## run for 10 seconds on all CPUs of node001 qsub -l walltime=0:1:0 -l nodes=1:2 -l pmem=8gb job5.sh ## Run two processes for one minute, requesting 8 Gb of memory each </pre> (In general you will not want to specify node or [[architecture|chassis]] names directly, as this will interfere with the efficiency of the scheduling. The exception might be if you want to guarantee that all inter-process communications take place within a chassis; see [[networking]]). Note also that in the present configuration requests for fewer processors per node than the 8 that are available may be consolidated onto fewer nodes. So the third request above, run on an empty cluster, would actually run on one physical node with 8 CPUs. This should never be a problem unless you are specifically testing inter-node communication. For efficiency, you are recommended always to ask for and use 8 processes per node for MPI or multi-threaded code. If you are running single-threaded code, you should ask for one processor per node. The defaults for all the other options are sensible but you ''must'' think carefully about the resources you need. Requesting more time or resources than you need will lead to inefficiency in scheduling your jobs: requesting less than you need (particularly in terms of walltime) is likely to mean that your job does not run to completion. A job that exceeds the walltime estimate will be terminated. 
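To help choose realistic <tt>walltime</tt> and <tt>pmem</tt> values for future submissions, you can look at what a running or recently completed job has actually used. A sketch, using an illustrative job id; the <tt>resources_used</tt> fields only appear once the job has started:

<pre>
qstat -f 1765 | grep resources_used
</pre>

The <tt>resources_used.walltime</tt> line gives the elapsed time so far and <tt>resources_used.mem</tt> the memory in use (see also the [[Memory]] page).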
''A walltime should be specified: the default walltime (see [[Queues]]) is probably not what you want.'' Once a job is submitted, you can view its progress with <tt>qstat</tt>: <pre> >qstat Job id Name User Time Use S Queue ------------------- ---------------- --------------- -------- - ----- 1765.stri-cluster HCG62-doublebeta mjh 00:00:00 R all 1766.stri-cluster ...59-doublebeta mjh 00:00:00 R all 1767.stri-cluster ...07-doublebeta mjh 00:00:00 R all 1768.stri-cluster ...83-doublebeta mjh 00:00:00 R all 1769.stri-cluster ...73-doublebeta mjh 00:00:00 R all 1770.stri-cluster ...61-doublebeta mjh 0 Q all 1771.stri-cluster ...36-doublebeta mjh 0 Q all 1772.stri-cluster ...44-doublebeta mjh 0 Q all 1773.stri-cluster ...29-doublebeta mjh 0 Q all 1774.stri-cluster ...33-doublebeta mjh 0 Q all 1775.stri-cluster ...46-doublebeta mjh 0 Q all >qstat -a stri-cluster.herts.ac.uk: Req'd Req'd Elap Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time -------------------- -------- -------- ---------- ------ ----- --- ------ ----- - ----- 1765.stri-cluster.he mjh all HCG62-doub 11125 16 -- -- 05:00 R 01:56 1766.stri-cluster.he mjh all IC1459-dou 10658 16 -- -- 05:00 R 01:56 1767.stri-cluster.he mjh all NGC3607-do 7355 16 -- -- 05:00 R 01:56 1768.stri-cluster.he mjh all NGC383-dou 12069 16 -- -- 05:00 R 01:56 1769.stri-cluster.he mjh all NGC4073-do 12793 16 -- -- 05:00 R 01:56 1770.stri-cluster.he mjh all NGC4261-do -- 16 -- -- 05:00 Q -- 1771.stri-cluster.he mjh all NGC4636-do -- 16 -- -- 05:00 Q -- 1772.stri-cluster.he mjh all NGC5044-do -- 16 -- -- 05:00 Q -- 1773.stri-cluster.he mjh all NGC5129-do -- 16 -- -- 05:00 Q -- 1774.stri-cluster.he mjh all NGC533-dou -- 16 -- -- 05:00 Q -- 1775.stri-cluster.he mjh all NGC5846-do -- 16 -- -- 05:00 Q -- </pre> The different options to qstat give different views of the queue. Here we see that some jobs are running (R) and some are queued (Q). You may also see jobs that have recently completed (C), jobs that are held pending some condition (H) and, hopefully rarely, jobs where there is some error in the scheduling system (E). The MAUI tool <tt>showq</tt> can also be used to get a view of the whole queue, which is sometimes more user-friendly: <pre> /usr/local/maui/bin/showq ACTIVE JOBS-------------------- JOBNAME USERNAME STATE PROC REMAINING STARTTIME 1765 mjh Running 128 1:15:20 Fri May 7 13:38:20 1766 mjh Running 128 1:15:20 Fri May 7 13:38:20 1767 mjh Running 128 1:15:20 Fri May 7 13:38:20 1768 mjh Running 128 1:15:20 Fri May 7 13:38:20 1769 mjh Running 128 1:15:20 Fri May 7 13:38:20 5 Active Jobs 640 of 640 Processors Active (100.00%) 80 of 80 Nodes Active (100.00%) IDLE JOBS---------------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME 1770 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1771 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1772 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1773 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1774 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1775 mjh Idle 128 5:00:00 Fri May 7 13:38:20 6 Idle Jobs BLOCKED JOBS---------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME Total Jobs: 11 Active Jobs: 5 Idle Jobs: 6 Blocked Jobs: 0 </pre> Finally, you can delete jobs from the queue with the Torque command <tt>qdel</tt>, you can change your resource request with <tt>qualter</tt>, and you can move queued jobs between different queues with <tt>qmove</tt>. == Running code == Your job will start running on ''one'' of the nodes it has been allocated. 
It is your script's responsibility to start your code running on any other nodes you have been allocated. At run time, environment variables contain information about the job. A full list is [http://www.clusterresources.com/torquedocs21/2.1jobsubmission.shtml#submitvars here]. The crucial one is $PBS_NODEFILE, which provides a list of nodes on which the job should run. ''You must honour this node list.'' The [http://www.clusterresources.com/torquedocs21/commands/pbsdsh.shtml pbsdsh] command can be used in your script to run a single program on all the allocated nodes, as in the following example: <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ pbsdsh hostname echo ------------------------------------------------------ echo Job ends </pre> pbsdsh would be suitable for starting a number of parallel processes which do not need to communicate, e.g. multiple identical Monte Carlo simulations. For scripts to start [[MPI]] processes in such a way that inter-process communication (IPC) works see the description of [[MPI]]. If you are running [[OpenMP]] code, which should only be running on a single node, then you need to make sure that the code does not grab all available CPUs. An easy way to make sure that you take only the CPUs that you have been allocated would be to put <pre> setenv OMP_NUM_THREADS `cat $PBS_NODEFILE | wc -l` </pre> in the qsub script before the code runs. == Environment == In all cases your script must set the environment that you want your program to run in. Environment variables, aliases and working directory are not inherited from the shell that runs qsub. So you will often want to change to an appropriate working directory and to source some startup files in your qsub script before you actually execute any code. 
<pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ cd /home/fred/my_working_directory export PATH=/home/fred/my_binaries:${PATH} /usr/local/bin/mpiexec my-mpi-code arg1 arg2 echo ------------------------------------------------------ echo Job ends </pre> Note that you can set environment variables in the environment of the running script with the <tt>-v</tt> option to <tt>qsub</tt>: <pre> qsub -v NAME=fred myjob.sh </pre> The value of <tt>$NAME</tt> is then available to the script. This is useful if you want to write a generic script and control it using a parameter, rather than editing the script. == Multiple job submission == The <tt>-t</tt> option to qsub allows you to create multiple jobs simultaneously. These will all run the same script but will differ in the value of the environment variable <tt>$PBS_ARRAYID</tt>. Your script should normally use this variable to decide how to behave differently. For example, multiple runs of a Monte Carlo simulation might use it to store the output in different files. The jobs will all be submitted to the queue immediately and may run concurrently, depending on the resources they request, so it is important to make sure that it is safe to do this. A common mistake is to use the -t option wrongly. <pre> qsub -t 4 myjob.qsub </pre> will only start ''one'' job, with $PBS_ARRAYID set to 4. <pre> qsub -t 1-4 myjob.qsub </pre> will queue four jobs, with $PBS_ARRAYID ranging from 1 to 4. == Interactive jobs == If you need to access nodes interactively, see the separate page on [[interactive jobs]]. == Jobs that depend on other jobs == You may wish to submit jobs that depend on each other: for example, to submit a job that only runs after another job has finished. (This allows you to stack up very long computing requests without violating the maximum walltime constraints.) Look at <tt>man qsub</tt> for the full details, but for a simple dependence use <pre> qsub -W depend=afterany:123456 myjob.qsub </pre> which will cause your job to be run only after job 123456 has finished (no matter what its output status). 456ef6f004de8e76f329cd512db523520114db2f Queues 0 15 324 297 2013-05-16T15:07:39Z Mjh 2 wikitext text/x-wiki There are six possible job queues available on the system: * 'main' is the default queue: this submits to the 48 nodes of the main cluster. The maximum wall time on this queue is 1 week. * 'cair_s' submits to a CAIR node dedicated to small (few CPU) jobs. This queue is restricted to CAIR users. * 'cair_l' submits to the 46 CAIR nodes with no maximum wall time. This queue is restricted to CAIR users. * 'smp' submits to the two [[SMP machines]]. It is not currently possible to run MPI jobs that span the SMP machines and the main or CAIR clusters. The maximum wall time for this queue is 48 hours. 
* 'sandbox' submits to the [[sandbox]] machines and is for use for development and testing. The maximum wall time for this queue is 1 hour. * 'car' submits to the 32 dedicated CAR nodes. This queue is restricted to CAR users. The maximum wall time for this queue is 1 week. == Default wall times == The default wall time for the 'sandbox' queue is also the maximum, i.e. 1 hour. The default wall time limitation for the 'cair_s' queue is also the maximum, i.e. 6 hours. The default wall time for all the other queues is 24 hours. If you want a job to run for longer, you must explicitly provide a wall time estimate. == Other defaults == The default memory use per process for the main queue is 1 Gb. If you need more memory than this, see the section on [[memory]]. The default number of nodes for a job on the main queue is 1. 5f6933bb3074d7034c2e93277cfa4eacc31f293e 339 324 2013-07-18T15:39:30Z Mjh 2 wikitext text/x-wiki There are six possible job queues available on the system: * 'main' is the default queue: this submits to the 56 nodes of the main cluster. The maximum wall time on this queue is 1 week. * 'cair_s' submits to a CAIR node dedicated to small (few CPU) jobs. This queue is restricted to CAIR users. * 'cair_l' submits to the 46 CAIR nodes with no maximum wall time. This queue is restricted to CAIR users. * 'smp' submits to the two [[SMP machines]]. It is not currently possible to run MPI jobs that span the SMP machines and the main or CAIR clusters. The maximum wall time for this queue is 48 hours. * 'sandbox' submits to the [[sandbox]] machines and is for use for development and testing. The maximum wall time for this queue is 1 hour. * 'car' submits to the 32 dedicated CAR nodes. This queue is restricted to CAR users. The maximum wall time for this queue is 1 week. == Default wall times == The default wall time for the 'sandbox' queue is also the maximum, i.e. 1 hour. The default wall time limitation for the 'cair_s' queue is also the maximum, i.e. 6 hours. The default wall time for all the other queues is 24 hours. If you want a job to run for longer, you must explicitly provide a wall time estimate. == Other defaults == The default memory use per process for the main queue is 1 Gb. If you need more memory than this, see the section on [[memory]]. The default number of nodes for a job on the main queue is 1. 4b40bed020124f8d755fb8bafdcff6117ac664c5 344 339 2013-08-13T16:44:56Z Mjh 2 wikitext text/x-wiki There are six possible job queues available on the system: * 'main' is the default queue: this submits to the 56 nodes of the main cluster. The maximum wall time on this queue is 1 week. * 'cair_s' submits to a CAIR node dedicated to small (few CPU) jobs. This queue is restricted to CAIR users. * 'cair_l' submits to the 46 CAIR nodes with no maximum wall time. This queue is restricted to CAIR users. * 'smp' submits to the [[SMP machines]]. It is not currently possible to run MPI jobs that span the SMP machines and the main or CAIR clusters. The maximum wall time for this queue is 48 hours. * 'sandbox' submits to the [[sandbox]] machines and is for use for development and testing. The maximum wall time for this queue is 1 hour. * 'car' submits to the 32 dedicated CAR nodes. This queue is restricted to CAR users. The maximum wall time for this queue is 1 week. == Default wall times == The default wall time for the 'sandbox' queue is also the maximum, i.e. 1 hour. The default wall time limitation for the 'cair_s' queue is also the maximum, i.e. 6 hours. 
The default wall time for all the other queues is 24 hours. If you want a job to run for longer, you must explicitly provide a wall time estimate. == Other defaults == The default memory use per process for the main queue is 1 Gb. If you need more memory than this, see the section on [[memory]]. The default number of nodes for a job on the main queue is 1. 3ca84b2e818ba4ed5bcb0e87850740d16e3de376 345 344 2013-08-13T16:45:36Z Mjh 2 wikitext text/x-wiki There are six possible job queues available for general use on the system: * 'main' is the default queue: this submits to the 56 nodes of the main cluster. The maximum wall time on this queue is 1 week. * 'cair_s' submits to a CAIR node dedicated to small (few CPU) jobs. This queue is restricted to CAIR users. * 'cair_l' submits to the 46 CAIR nodes with no maximum wall time. This queue is restricted to CAIR users. * 'smp' submits to the [[SMP machines]]. It is not currently possible to run MPI jobs that span the SMP machines and the main or CAIR clusters. The maximum wall time for this queue is 48 hours. * 'sandbox' submits to the [[sandbox]] machines and is for use for development and testing. The maximum wall time for this queue is 1 hour. * 'car' submits to the 32 dedicated CAR nodes. This queue is restricted to CAR users. The maximum wall time for this queue is 1 week. == Default wall times == The default wall time for the 'sandbox' queue is also the maximum, i.e. 1 hour. The default wall time limitation for the 'cair_s' queue is also the maximum, i.e. 6 hours. The default wall time for all the other queues is 24 hours. If you want a job to run for longer, you must explicitly provide a wall time estimate. == Other defaults == The default memory use per process for the main queue is 1 Gb. If you need more memory than this, see the section on [[memory]]. The default number of nodes for a job on the main queue is 1. 02b8e5ab9d79e27a9acd1c9e88ec2931f4e1426a Reservations 0 46 325 318 2013-05-17T08:03:19Z Mjh 2 wikitext text/x-wiki It is possible to request reservations of one or more nodes at fixed times in the future. If you have reserved a node, no other jobs than yours will be able to run on it. You can request a reservation from the [[administrators]] or, if you have a particular need to do this often, we can set things up so that you can create your own reservations automatically. Please only ask for a reservation if you believe that there is no method of doing what you want within the standard [[Jobs|queueing system]]. Currently members of the CTCA group are able to reserve the smp machines themselves by using the command <pre> sudo -u maui /usr/local/bin/reserve-ctca.sh [machine name] [start time] [end time]</pre> where <tt>[start time]</tt> and <tt>[end time]</tt> have the format [HH[:MM[:SS]]][_MO[/DD[/YY]]], e.g. 14:30_06/20 means 14:30 on the 20th of June this year. Reservations will usually be for a group of people, but may be for an individual. If you need to use a group reservation, you will need to know the name of the group in question, and you will need to belong to that group. Typing <tt>groups</tt> at a shell prompt on the head node will tell you what groups you belong to. General guidelines for user-created reservations are as follows: * Reserve the machine for a period when no existing jobs will be running -- use <tt>qstat</tt> to determine when this will be. * If you are using a personal reservation, use the reservation by submitting a job as normal. 
A reservation available to you will be used automatically if it is available. If you need an interactive session, use an [[interactive jobs|interactive job]]; e.g. if you have reserved smp2 for two days, do <tt>qsub -I -q smp -l nodes=smp2:ppn=48 -l walltime=48:00:00</tt>. * If you are using a group reservation, specify that you want to use it by adding the option <tt>-W group_list=[groupname]</tt> to the <tt>qsub</tt> command or script. E.g. to use 8 cores of the <tt>scuba2</tt> group reservation on smp1 interactively, do <tt>qsub -W group_list=scuba2 -q smp -l nodes=smp1:ppn=8 -I</tt>. Again, the reservation will be used if the resources are available, and your job will otherwise go into the general pool. * If you no longer need a reservation, e-mail the administrators to ask them to delete it. 063d21b0d759109e2b2797971b828af16d670435 326 325 2013-05-17T08:04:04Z Mjh 2 wikitext text/x-wiki It is possible to request reservations of one or more nodes at fixed times in the future. If you have reserved a node, no other jobs than yours will be able to run on it. You can request a reservation from the [[administrators]] or, if you have a particular need to do this often, we can set things up so that you can create your own reservations automatically. Please only ask for a reservation if you believe that there is no method of doing what you want within the standard [[Jobs|queueing system]]. Currently members of the CTCA group are able to reserve the smp machines themselves by using the command <pre> sudo -u maui /usr/local/bin/reserve-ctca.sh [machine name] [start time] [end time]</pre> where <tt>[start time]</tt> and <tt>[end time]</tt> have the format [HH[:MM[:SS]]][_MO[/DD[/YY]]], e.g. 14:30_06/20 means 14:30 on the 20th of June this year. Reservations will usually be for a group of people, but may be for an individual. If you need to use a group reservation, you will need to know the name of the group in question, and you will need to belong to that group. Typing <tt>groups</tt> at a shell prompt on the head node will tell you what groups you belong to. General guidelines for reservations are as follows: * If creating a reservation yourself, reserve the machine(s) for a period when no existing jobs will be running -- use <tt>qstat</tt> to determine when this will be. * If you are using a personal reservation, use the reservation by submitting a job as normal. A reservation available to you will be used automatically if it is available. If you need an interactive session, use an [[interactive jobs|interactive job]]; e.g. if you have reserved smp2 for two days, do <tt>qsub -I -q smp -l nodes=smp2:ppn=48 -l walltime=48:00:00</tt>. * If you are using a group reservation, specify that you want to use it by adding the option <tt>-W group_list=[groupname]</tt> to the <tt>qsub</tt> command or script. E.g. to use 8 cores of the <tt>scuba2</tt> group reservation on smp1 interactively, do <tt>qsub -W group_list=scuba2 -q smp -l nodes=smp1:ppn=8 -I</tt>. Again, the reservation will be used if the resources are available, and your job will otherwise go into the general pool. * If you no longer need a reservation, e-mail the administrators to ask them to delete it. 8883aa2e86e6f2f4de796e71d208f6d7b20c9b25 Gromacs 0 19 329 105 2013-06-14T13:42:01Z Akukol 3 /* How to perform a simulation with Gromacs' mdrun: */ wikitext text/x-wiki [http://www.gromacs.org Gromacs] is a software for molecular dynamics simulation. It treats molecules as particles in a classical mechanics forcefield. 
== '''How to perform a simulation with Gromacs' mdrun:''' == 1) You need to prepare the binary simulation start file (tpr-file) either on your local machine or on the headnode of the cluster. If you prepare it on the local machine, make sure you use the same version of Gromacs. 2) Prepare a c-shell script (e.g. runjob.sh) as shown in the example below. [[Jobs|More info.]] 3) Make the script executable: chmod +x runjob.sh 4) Submit the job to the cluster: qsub runjob.sh The script below needs to be adjusted according to your needs, i.e. the user name (#PBS -u), the number of nodes to use (#PBS -l, each node has 8 CPU cores), the maximum time a job should run (#PBS -l walltime=<hours>:<min>:<sec>), the working directory and the details of the mdrun command. Note that the default on the 'main' queue is 24 hours. If you don't specify any [[Queues|walltime]], the job will stop after 24 hours. Look here for [[groperform|optimising performance]]. -------------- <pre>#!/bin/sh #PBS -N GromacsTest #PBS -q main #PBS -l nodes=1:ppn=8 #PBS -l walltime=02:00:00 #PBS -j oe #PBS -u akukol # runs a job with name 'GromacsTest' on the 'main' cluster # uses 1 node and 8 CPUs (each nodes has 8 CPUs) # set a maximum time of two hours (walltime) # merge 'output' and 'standard error' and output both to 'standard output' (-j oe) # specifies user 'akukol' # set required paths: source /soft/gromacs-new/bin/GMXRC # used previously: export LD_LIBRARY_PATH='/usr/lib64/mpich2/lib:$LD_LIBRARY_PATH' export LD_LIBRARY_PATH='/usr/mpi/gcc/mvapich2-1.6/lib:$LD_LIBRARY_PATH' # specify working directory: cd /home/akukol/groTest ### This is the command ### /usr/local/bin/mpiexec mdrun -s md_test.tpr -c after_md.gro -v -stepout 1000 ### command end ### # start with 'qsub runjob.sh' </pre> [http://www.phy.bme.hu/~cluster/docs/PBS.html A good tutorial for using PBS] db110de53076f81d5309c60e1abc500d7da1b72e 330 329 2013-06-14T13:44:28Z Akukol 3 /* How to perform a simulation with Gromacs' mdrun: */ wikitext text/x-wiki [http://www.gromacs.org Gromacs] is a software for molecular dynamics simulation. It treats molecules as particles in a classical mechanics forcefield. == '''How to perform a simulation with Gromacs' mdrun:''' == 1) You need to prepare the binary simulation start file (tpr-file) either on your local machine or on the headnode of the cluster. If you prepare it on the local machine, make sure you use the same version of Gromacs. 2) Prepare a c-shell script (e.g. runjob.sh) as shown in the example below. [[Jobs|More info.]] 3) Make the script executable: chmod +x runjob.sh 4) Submit the job to the cluster: qsub runjob.sh The script below needs to be adjusted according to your needs, i.e. the user name (#PBS -u), the number of nodes to use (#PBS -l, each node has 8 CPU cores), the maximum time a job should run (#PBS -l walltime=<hours>:<min>:<sec>), the working directory and the details of the mdrun command. Note that the default on the 'main' queue is 24 hours. If you don't specify any [[Queues|walltime]], the job will stop after 24 hours. Look here for [[groperform|optimising performance]]. 
-------------- <pre>#!/bin/sh #PBS -N GromacsTest #PBS -q main #PBS -l nodes=1:ppn=8 #PBS -l walltime=02:00:00 #PBS -j oe #PBS -k oe #PBS -u akukol # runs a job with name 'GromacsTest' on the 'main' cluster # uses 1 node and 8 CPUs (each nodes has 8 CPUs) # set a maximum time of two hours (walltime) # merge 'output' and 'standard error' and output both to 'standard output' (-j oe) # produced the output, while the job is running (-k oe) # specifies user 'akukol' # set required paths: source /soft/gromacs-new/bin/GMXRC # used previously: export LD_LIBRARY_PATH='/usr/lib64/mpich2/lib:$LD_LIBRARY_PATH' export LD_LIBRARY_PATH='/usr/mpi/gcc/mvapich2-1.6/lib:$LD_LIBRARY_PATH' # specify working directory: cd /home/akukol/groTest ### This is the command ### /usr/local/bin/mpiexec mdrun -s md_test.tpr -c after_md.gro -v -stepout 1000 ### command end ### # start with 'qsub runjob.sh' </pre> [http://www.phy.bme.hu/~cluster/docs/PBS.html A good tutorial for using PBS] e4ffe0030b698f2d993191e75fae720158fa0b50 Known problems 0 25 331 296 2013-06-22T16:46:19Z Mjh 2 /* Known problems */ wikitext text/x-wiki == Known problems == * Although the Infiniband links carry NFS traffic between the nodes and the head node, they are not using RDMA and so their latency is higher and bandwidth lower than it should be. This seems to be a problem with the Linux kernels on the nodes; NFS over RDMA+infiniband is unstable. * The scheduler sometimes crashes for unknown reasons causing jobs not to run. 9546f0b58cfd966b92b7eeb637e755c214d36ae3 352 331 2013-10-01T12:47:27Z Mjh 2 wikitext text/x-wiki == Known problems == * Although the Infiniband links carry NFS traffic between the nodes and the head node, they are not using RDMA and so their latency is higher and bandwidth lower than it should be. This seems to be a problem with the Linux kernels on the nodes; NFS over RDMA+infiniband is unstable. * The scheduler sometimes crashes for unknown reasons causing jobs not to run. (Regularly run scripts check and restart the scheduler.) * The scheduler very occasionally will not run a job that could be run immediately in free resources. If <tt>checkjob</tt> shows 'job can run' and <tt>showstart</tt> shows an immediate start, but your job does not run, then please contact the [[administrators]]. aa7bc1fbc040b995f6fdce8e8efd5030b1a9ff8a Architecture 0 7 332 291 2013-06-22T16:53:17Z Mjh 2 wikitext text/x-wiki The cluster consists of * a head node, which is an 2 socket x 4-core x 2 Hyperthreads Xeon-based machine with 32 Gb RAM * 140 compute nodes (or just 'nodes'), as follows: ** 48 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 24 Gb RAM and DDR Infiniband form part of the Main cluster (chassis 1, 2 and 3) ** 32 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 12 Gb RAM and QDR Infiniband form part of the CAIR cluster (chassis 4 and 5) ** 12 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 Gb RAM and QDR Infiniband form the rest of the CAIR cluster (chassis 6) ** 16 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 Gb RAM and QDR Infiniband form part of the CAR cluster (chassis 7) ** 16 Xeons (X5675s) 2 socket x 6-core (no hyperthreading) with 24 Gb RAM and QDR Infiniband form the rest of the CAR cluster (chassis 8) ** 16 Xeons (E5-2660s) 2 socket x 8-core with 32 Gb RAM and FDR Infiniband form the rest of the main cluster and some dedicated CAR nodes (chassis 9) * Two [[SMP machines]] each with 48 cores (4 sockets x 12 cores, Opteron 6174), 256 Gb RAM and QDR Infiniband (smp1 and smp2). 
* A cair data processing server - [[cair-cluster]], which is a 2 socket x 8 core Xeon-based machine with 32 Gb RAM ('''CAIR use only'''). ** 77 Tb of [[storage]] attached via Fibre Channel to this server. * A [[Tesla|Tesla S2050]] CUDA unit attached to smp1. * 9 [[sandbox machines]] (lower-power machines with 1 Gb/core and no Infiniband, intended for testing and development). * 345 Tb of [[storage]] attached via Fibre Channel to the head node. * A separate server providing an additional 12 Tb of storage for CAIR use. * Ethernet and infiniband switches to provide connectivity. The compute nodes are Dell blades installed in enclosures (chassis) of 16 blades each: there are 9 chassis in total. Nodes within a given chassis communicate via switches internal to the chassis; external switches connect the chassis together. See [[networking]] for more details. The image below shows the physical layout of the cluster components as at late 2011. The cluster is no longer in this configuration. http://stri-cluster.herts.ac.uk/cluster2.jpg c91df12f9070cf9edfa041e3649b64336b9fc6fb 350 332 2013-09-27T10:51:54Z Mjh 2 wikitext text/x-wiki The cluster consists of * a head node, which is an 2 socket x 4-core x 2 Hyperthreads Xeon-based machine with 32 Gb RAM for user logins and development * 140 compute nodes (or just 'nodes'), as follows: ** 48 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 24 Gb RAM and DDR Infiniband form part of the Main cluster (chassis 1, 2 and 3) ** 32 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 12 Gb RAM and QDR Infiniband form part of the CAIR cluster (chassis 4 and 5) ** 12 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 Gb RAM and QDR Infiniband form the rest of the CAIR cluster (chassis 6) ** 16 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 Gb RAM and QDR Infiniband form part of the CAR cluster (chassis 7) ** 16 Xeons (X5675s) 2 socket x 6-core (no hyperthreading) with 24 Gb RAM and QDR Infiniband form the rest of the CAR cluster (chassis 8) ** 16 Xeons (E5-2660s) 2 socket x 8-core with 32 Gb RAM and FDR Infiniband form the rest of the main cluster and some dedicated CAR nodes (chassis 9) * Three [[SMP machines]], two with 48 cores (4 sockets x 12 cores, Opteron 6174) and one with 32 (4 sockets x 8 cores, Xeon 4620), all with 256 Gb RAM and QDR Infiniband (smp1-3). * A file server, stri-server, which is a 2 socket x 8 core Xeon machine ** 345 Tb of [[storage]] attached via Fibre Channel to this server. * A cair data processing server - [[cair-cluster]], which is a 2 socket x 8 core Xeon-based machine with 32 Gb RAM ('''CAIR use only'''). ** 77 Tb of [[storage]] attached via Fibre Channel to this server. * A [[Tesla|Tesla S2050]] CUDA unit attached to smp1. * 5 [[sandbox machines]] (lower-power machines with 1 Gb/core and no Infiniband, intended for testing and development). * A separate server providing an additional 12 Tb of storage for CAIR use. * Ethernet and infiniband switches to provide connectivity. The compute nodes are Dell blades installed in enclosures (chassis) of 16 blades each: there are 9 chassis in total. Nodes within a given chassis communicate via switches internal to the chassis; external switches connect the chassis together. See [[networking]] for more details. The image below shows the physical layout of the cluster components as at late 2011. The cluster is no longer in this configuration. 
http://stri-cluster.herts.ac.uk/cluster2.jpg 4629217c32135e42840ecdf9e0d91742b22c14be 351 350 2013-10-01T12:44:10Z Mjh 2 wikitext text/x-wiki The cluster consists of * a head node, which is an 2 socket x 4-core x 2 Hyperthreads Xeon-based machine with 32 Gb RAM for user logins and development * 140 compute nodes (or just 'nodes'), as follows: ** 48 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 24 Gb RAM and DDR Infiniband form part of the Main cluster (chassis 1, 2 and 3) ** 32 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 12 Gb RAM and QDR Infiniband form part of the CAIR cluster (chassis 4 and 5) ** 12 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 Gb RAM and QDR Infiniband form the rest of the CAIR cluster (chassis 6) ** 16 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 Gb RAM and QDR Infiniband form part of the CAR cluster (chassis 7) ** 16 Xeons (X5675s) 2 socket x 6-core (no hyperthreading) with 24 Gb RAM and QDR Infiniband form the rest of the CAR cluster (chassis 8) ** 16 Xeons (E5-2660s) 2 socket x 8-core with 32 Gb RAM and FDR Infiniband form the rest of the main cluster and some dedicated CAIR nodes (chassis 9) * Three [[SMP machines]], two with 48 cores (4 sockets x 12 cores, Opteron 6174) and one with 32 (4 sockets x 8 cores, Xeon 4620), all with 256 Gb RAM and QDR Infiniband (smp1-3). * A file server, stri-server, which is a 2 socket x 8 core Xeon machine ** 345 Tb of [[storage]] attached via Fibre Channel to this server. * A cair data processing server - [[cair-cluster]], which is a 2 socket x 8 core Xeon-based machine with 32 Gb RAM ('''CAIR use only'''). ** 77 Tb of [[storage]] attached via Fibre Channel to this server. * A [[Tesla|Tesla S2050]] CUDA unit attached to smp1. * 5 [[sandbox machines]] (lower-power machines with 1 Gb/core and no Infiniband, intended for testing and development). * A separate server providing an additional 12 Tb of storage for CAIR use. * Ethernet and infiniband switches to provide connectivity. The compute nodes are Dell blades installed in enclosures (chassis) of 16 blades each: there are 9 chassis in total. Nodes within a given chassis communicate via switches internal to the chassis; external switches connect the chassis together. See [[networking]] for more details. The image below shows the physical layout of the cluster components as at late 2011. The cluster is no longer in this configuration. 
http://stri-cluster.herts.ac.uk/cluster2.jpg 65c25f176297776113f3d19f9168628d2d110c3c 371 351 2014-02-11T21:32:28Z Mjh 2 wikitext text/x-wiki The cluster consists of * a head node, which is an 2 socket x 4-core x 2 Hyperthreads Xeon-based machine with 32 Gb RAM for user logins and development * 140 compute nodes (or just 'nodes'), as follows: ** 48 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 24 Gb RAM and DDR Infiniband form part of the Main cluster (chassis 1, 2 and 3) ** 32 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 12 Gb RAM and QDR Infiniband form part of the CAIR cluster (chassis 4 and 5) ** 12 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 Gb RAM and QDR Infiniband form the rest of the CAIR cluster (chassis 6) ** 16 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 Gb RAM and QDR Infiniband form part of the CAR cluster (chassis 7) ** 16 Xeons (X5675s) 2 socket x 6-core (no hyperthreading) with 24 Gb RAM and QDR Infiniband form the rest of the CAR cluster (chassis 8) ** 16 Xeons (E5-2660s) 2 socket x 8-core with 32 Gb RAM and FDR Infiniband form the rest of the main cluster and some dedicated CAIR nodes (chassis 9) * Three [[SMP machines]], two with 48 cores (4 sockets x 12 cores, Opteron 6174) and one with 32 (4 sockets x 8 cores, Xeon 4620), all with 256 Gb RAM and QDR Infiniband (smp1-3). * A file server, stri-server, which is a 2 socket x 8 core Xeon machine ** 345 Tb of [[storage]] attached via Fibre Channel to this server. * A cair data processing server - [[cair-cluster]], which is a 2 socket x 8 core Xeon-based machine with 32 Gb RAM ('''CAIR use only'''). ** 77 Tb of [[storage]] attached via Fibre Channel to this server. * A [[Tesla|Tesla S2050]] CUDA unit attached to smp1. * A separate server providing an additional 12 Tb of storage for CAIR use. * Ethernet and infiniband switches to provide connectivity. The compute nodes are Dell blades installed in enclosures (chassis) of 16 blades each: there are 9 chassis in total. Nodes within a given chassis communicate via switches internal to the chassis; external switches connect the chassis together. See [[networking]] for more details. The image below shows the physical layout of the cluster components as at late 2011. The cluster is no longer in this configuration. http://stri-cluster.herts.ac.uk/cluster2.jpg 4750a75ebc18777c4501bc8ebbfc994779c57749 CASA 0 28 334 268 2013-06-25T13:26:07Z Mjh 2 wikitext text/x-wiki CASA is software for radio astronomy data reduction. Various versions are installed on the cluster. The latest version is always in <tt>/soft/casapy</tt> (a symbolic link to the real directory). To use casa, do <tt>module load casa</tt> and then run it with <tt>casapy</tt>. You should not run CASA on the head node: either run it through the batch job system or use an [[interactive jobs|interactive job]]. 
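A minimal sketch of a batch submission for CASA, assuming the <tt>module</tt> command is available to batch shells and that the reduction is driven by a CASA Python script (the script name and the <tt>--nogui --nologger</tt> flags are illustrative and should be checked against your CASA version):

<pre>
#!/bin/sh
#PBS -N casa-reduction
#PBS -l nodes=1:ppn=1
#PBS -l walltime=4:00:00
# run in the directory containing the data and the reduction script
cd $PBS_O_WORKDIR
module load casa
casapy --nogui --nologger -c my_reduction.py   # my_reduction.py is a hypothetical CASA script
</pre>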
3a4c06990b33121bd4aaa176cc11962be150d89a LOFAR 0 47 335 2013-06-25T13:27:23Z Mjh 2 Created page with "To run LOFAR software, do <tt>module load LOFAR</tt> <tt>source /soft/lofar/lofarinit.csh</tt>" wikitext text/x-wiki To run LOFAR software, do <tt>module load LOFAR</tt> <tt>source /soft/lofar/lofarinit.csh</tt> c79c0a807adf24aed35ae8aa9e56f989c3ef43f9 336 335 2013-06-25T13:27:33Z Mjh 2 wikitext text/x-wiki To run LOFAR software, do <tt>module load lofar</tt> <tt>source /soft/lofar/lofarinit.csh</tt> 413f0927aede380eef8abca928dc01b89ec84c7d 337 336 2013-06-25T14:20:51Z Mjh 2 wikitext text/x-wiki To run LOFAR software, do <tt>module load lofar</tt> <tt>source /soft/lofar/lofarinit.csh</tt> You will need a <tt>.casarc</tt> file, something like this should do: <pre> measures.DE200.directory: /soft/casapy/data/ephemerides measures.DE405.directory: /soft/casapy/data/ephemerides measures.line.directory: /soft/casapy/data/ephemerides measures.sources.directory: /soft/casapy/data/ephemerides measures.comet.directory: /soft/casapy/data/ephemerides measures.ierseop97.directory: /soft/casapy/data/geodetic measures.ierspredict.directory: /soft/casapy/data/geodetic measures.tai_utc.directory: /soft/casapy/data/geodetic measures.igrf.directory: /soft/casapy/data/geodetic measures.observatory.directory: /soft/casapy/data/geodetic </pre> 2ab6eede147113a5479268426ca9df2931e36b01 348 337 2013-09-05T12:14:51Z Mjh 2 wikitext text/x-wiki To run LOFAR software, do <pre> module load lofar source /soft/lofar/lofarinit.csh </pre> You will need a <tt>.casarc</tt> file, something like this should do: <pre> measures.DE200.directory: /soft/casapy/data/ephemerides measures.DE405.directory: /soft/casapy/data/ephemerides measures.line.directory: /soft/casapy/data/ephemerides measures.sources.directory: /soft/casapy/data/ephemerides measures.comet.directory: /soft/casapy/data/ephemerides measures.ierseop97.directory: /soft/casapy/data/geodetic measures.ierspredict.directory: /soft/casapy/data/geodetic measures.tai_utc.directory: /soft/casapy/data/geodetic measures.igrf.directory: /soft/casapy/data/geodetic measures.observatory.directory: /soft/casapy/data/geodetic </pre> The 'new' <tt>awimager</tt> is installed. Because it uses its own (partial) installation of the LOFAR software, it needs to be run separately. Do <pre> module load awimager source /soft/awimager/lofarinit.csh </pre> to give you access to the <tt>awimager</tt> command. Note that the main difference between this version of <tt>awimager</tt> and the standard one that comes with the LOFAR software is that it runs multi-threaded using [[OpenMP]]. You will need to set up use of threads appropriately as described in the [[OpenMP]] page if you want to run this version. 
f3a72c670629f39526175273152a7554d24f80a2 349 348 2013-09-05T12:16:05Z Mjh 2 wikitext text/x-wiki To run LOFAR software, do <pre> module load lofar source /soft/lofar/lofarinit.csh </pre> You will need a <tt>.casarc</tt> file, something like this should do: <pre> measures.DE200.directory: /soft/casapy/data/ephemerides measures.DE405.directory: /soft/casapy/data/ephemerides measures.line.directory: /soft/casapy/data/ephemerides measures.sources.directory: /soft/casapy/data/ephemerides measures.comet.directory: /soft/casapy/data/ephemerides measures.ierseop97.directory: /soft/casapy/data/geodetic measures.ierspredict.directory: /soft/casapy/data/geodetic measures.tai_utc.directory: /soft/casapy/data/geodetic measures.igrf.directory: /soft/casapy/data/geodetic measures.observatory.directory: /soft/casapy/data/geodetic </pre> The 'new' <tt>awimager</tt> is installed. Because it uses its own (partial) installation of the LOFAR software, it needs to be run separately. Do <pre> module load awimager source /soft/awimager/lofarinit.csh </pre> to give you access to the <tt>awimager</tt> command. Note that the main difference between this version of <tt>awimager</tt> and the standard one that comes with the LOFAR software is that it runs multi-threaded using [[OpenMP]]. You will need to set up use of threads appropriately as described in the [[jobs]] page if you want to run this version. 80d586bafbe777823dcc22b8b66e29fbca12241f Access 0 5 338 251 2013-07-04T11:27:23Z Mjh 2 /* Access */ wikitext text/x-wiki == Access == The [[architecture|head node]] of the cluster is accessible from within the university by ssh to stri-cluster.herts.ac.uk, once you have an [[accounts|account]] set up. If you are working from a Unix desktop, you should be able to type <tt>ssh username@stri-cluster.herts.ac.uk</tt>. If you are using Windows, you will need a Windows ssh client: we recommend PuTTY[http://www.chiark.greenend.org.uk/~sgtatham/putty/]. Unless specific authorization from the [[administrators]] is provided to the contrary, individual compute nodes must be accessed either through batch [[jobs]] or via [[interactive jobs]] run on the head node: see also the [[policies|policy]] relating to this. dba255fefd8cfa97684f76feb3a61dbe90857467 Local disk space 0 48 341 2013-07-28T16:42:36Z Mjh 2 Created page with "The main compute nodes have a limited amount of local disk space (around 50 Gb for nodes001-080 and 110 Gb for nodes081-144). This area is mounted on /local and is only visible i..." wikitext text/x-wiki The main compute nodes have a limited amount of local disk space (around 50 Gb for nodes001-080 and 110 Gb for nodes081-144). This area is mounted on /local and is only visible internally to the node. The normal expectation is that you will use the central [[storage]] for file operations, including data processing. However, there may be circumstances where it will be useful to copy some data to the nodes, do some I/O intensive operations on it and copy it back to the storage. In this case you may use the /local area. If you want to do this, to avoid interfering with other jobs: * You ''must'' reserve the maximum amount of space that your job will use using the <tt>file</tt> option to <tt>qsub</tt>; e.g. <pre>qsub -l nodes=1,file=10gb</pre> * You must create a directory in /local in which your job will work as part of your <tt>qsub</tt> script, which will be unique to your job. 
For example, you might want to do <pre> mkdir /local/$PBS_JOBID cd /local/$PBS_JOBID </pre> * You must only work in this directory, and the total filespace you use must not exceed the reserved amount. * When your job is finished it must, before exiting, clear up the filespace it used; no files must be left in <tt>/local</tt>. Note that these rules do not apply to the <tt>/scratch</tt> directories on the [[SMP machines]]. 19af43dd61db15382d7e196aeb1b26313837b2a9 Storage 0 8 342 281 2013-07-28T16:49:14Z Mjh 2 wikitext text/x-wiki The '''storage''' available on the cluster is set up as follows: * 1 Tb of user home directories, mounted as /home * 61 Tb of scratch available to all users, mounted as /stri-data * 58 Tb of scratch for CAIR users only, mounted as /cair-scratch * 59 Tb of scratch for CAIR users only, mounted as /cair-data * 167 Tb of scratch for CAR users only, mounted as /car-data * 77 Tb of scratch for CAIR users only, mounted as /cair-work There is also an area for software to be shared over the network, mounted as /soft. These paths should work for both the head node and the compute nodes. The /home disc is for files that are important and require long-term storage (source code, qsub scripts, etc.). It should not be used for general data or for large quantities of output from jobs, for example. Large files should be stored on the relevant data disk (if you want to work in /home, use a symbolic link). A [[quota]] system is in place on /home. Please use the scratch space responsibly: when you no longer need it, delete the contents or transfer them elsewhere so that others can use the space. Large datasets will be processed on the cluster. (See also [[policies]].) The [[SMP machines]] have their own local 10-Tb scratch discs: see the relevant page for more information. Nodes also have a small amount of [[local disk space]] accessible to jobs (but not mounted on the head node). No data area on the cluster is currently backed up. You must take responsibility for your own backups. In the longer term we expect to be able to make regular backups of /home and possibly some of /stri-data. 79ec6a6f1ecef9c8e402b7da195863ff23536919 SMP machines 0 24 343 216 2013-08-13T16:44:21Z Mjh 2 wikitext text/x-wiki The SMP machines are: * smp1, smp2: two 4-processor, 48-core systems each with 256 Gb of RAM, available for general use. The individual cores in these machines are slightly slower (~ 20% on CPU-bound floating-point operations) than the Intel CPUs of the main computing nodes. So users are recommended to use the main part of the cluster for straightforward computing requirements that do not need the special features of the SMP machines. * smp3: one 4-processor, 32-core system with 2.2-GHz E5-4620 Intel CPUs and 256 Gb RAM available to CAR users only. The big advantage of the SMP machines is the large amount of physical memory visible to all cores. This allows for multi-threaded, shared-memory applications. The SMP machines each also have a large amount of local scratch space (10 Tb for smp1/2, 30 Tb for smp3) which is mounted as /scratch on the SMP machines and visible as /smp1, /smp2 and /smp3 on the head node. smp3 is intended for data reduction for CAR users only. Jobs may be started on the SMP machines using the <tt>smp</tt> [[queues|queue]] in Torque.
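For example (the walltime and executable name here are illustrative), a shared-memory job using all 48 cores of one of the general-use SMP machines could be submitted with a script such as:

<pre>
#!/bin/sh
#PBS -q smp
#PBS -l nodes=1:ppn=48
#PBS -l walltime=24:00:00
cd $PBS_O_WORKDIR
# take only the CPUs that have been allocated to the job, as recommended on the Jobs page
OMP_NUM_THREADS=`cat $PBS_NODEFILE | wc -l`
export OMP_NUM_THREADS
./my-threaded-code   # hypothetical multi-threaded executable
</pre>

An interactive session on an SMP machine can be obtained in the same way with <tt>qsub -I -q smp</tt>; see [[interactive jobs]].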
b4fb272448695c83b4eee874fb41c227ebad5a30 Python packages 0 49 347 2013-08-16T09:30:53Z Mjh 2 Created page with "Local python packages installed in <tt>/soft</tt> include * astropy [http://pypi.python.org/packages/source/a/astropy/astropy-0.2.4.tar.gz] * kapteyn [http://www.astro.rug.nl/so..." wikitext text/x-wiki Local python packages installed in <tt>/soft</tt> include * astropy [http://pypi.python.org/packages/source/a/astropy/astropy-0.2.4.tar.gz] * kapteyn [http://www.astro.rug.nl/software/kapteyn/] * h5py * mpi4py * hcluster Set your PYTHONPATH to <tt>/soft/python/lib64/python2.7/site-packages</tt> to make these available. Please don't install local copies of software that is already globally available! If you need an upgraded version, talk to the system [[administrators]]. f5ef9ecb80cabba4d4e9bc3bfeb59bd6ae886a44 Why doesn't my job run? 0 37 353 260 2013-10-01T12:50:41Z Mjh 2 wikitext text/x-wiki If you submit a [[jobs|job]] and it stays in the queue, it may not be clear why nothing is happening. Obviously one possibility is that there is a problem with the cluster: but it is also possible that things are going on that you are not seeing. To ask the scheduler what is going on with a job you can use the command <tt>/usr/local/maui/bin/checkjob</tt>. A typical <tt>checkjob</tt> run might look like this (note the <tt>-v</tt> option): <pre> /usr/local/maui/bin/checkjob -v 123456 checking job 123456 (RM job '123456.stri-cluster.herts.ac.uk') State: Idle Creds: user:fred group:fred class:main qos:DEFAULT WallTime: 00:00:00 of 7:00:00:00 SubmitTime: Fri Jul 8 09:04:48 (Time Queued Total: 00:38:52 Eligible: 00:38:52) Total Tasks: 24 Req[0] TaskCount: 24 Partition: ALL Network: [NONE] Memory >= 1024M Disk >= 0 Swap >= 0 Opsys: [NONE] Arch: [NONE] Features: [main] Exec: '' ExecSize: 0 ImageSize: 0 Dedicated Resources Per Task: PROCS: 1 MEM: 1024M NodeAccess: SHARED TasksPerNode: 8 NodeCount: 3 IWD: [NONE] Executable: [NONE] Bypass: 63 StartCount: 0 PartitionMask: [ALL] Flags: RESTARTABLE PE: 24.00 StartPriority: 2513 job cannot run in partition DEFAULT (idle procs do not meet requirements : 0 of 24 procs found) idle procs: 732 feasible procs: 0 Rejection Reasons: [Features : 58][CPU : 22][State : 18][ReserveTime : 8] Detailed Node Availability Information: node001 rejected : ReserveTime node002 rejected : ReserveTime node003 rejected : ReserveTime node004 rejected : State node005 rejected : ReserveTime node006 rejected : ReserveTime node007 rejected : ReserveTime node008 rejected : ReserveTime node009 rejected : ReserveTime node010 rejected : CPU node011 rejected : CPU node012 rejected : CPU node013 rejected : State node014 rejected : CPU node015 rejected : CPU node016 rejected : CPU node017 rejected : State node018 rejected : State node019 rejected : State node020 rejected : State node021 rejected : State node022 rejected : State node023 rejected : State node024 rejected : State node025 rejected : State node026 rejected : State node027 rejected : State node028 rejected : State node029 rejected : State node030 rejected : State node031 rejected : State node032 rejected : CPU node033 rejected : CPU node034 rejected : CPU node035 rejected : CPU node036 rejected : CPU node037 rejected : CPU node038 rejected : CPU node039 rejected : CPU node040 rejected : CPU node041 rejected : State node042 rejected : CPU node043 rejected : CPU node044 rejected : CPU node045 rejected : CPU node046 rejected : CPU node047 rejected : CPU node048 rejected : CPU node049 rejected : Features node050 rejected : Features 
node051 rejected : Features node052 rejected : Features node053 rejected : Features node054 rejected : Features node055 rejected : Features node056 rejected : Features node057 rejected : Features node058 rejected : Features node059 rejected : Features node060 rejected : Features node061 rejected : Features node062 rejected : Features node063 rejected : Features node064 rejected : Features node065 rejected : Features node066 rejected : Features node067 rejected : Features node068 rejected : Features node069 rejected : Features node070 rejected : Features node071 rejected : Features node072 rejected : Features node073 rejected : Features node074 rejected : Features node075 rejected : Features node076 rejected : Features node077 rejected : Features node078 rejected : Features node079 rejected : Features node080 rejected : Features sandbox1 rejected : Features sandbox2 rejected : Features sandbox3 rejected : Features sandbox4 rejected : Features sandbox5 rejected : Features sandbox6 rejected : Features sandbox7 rejected : Features sandbox8 rejected : Features sandbox9 rejected : Features sandbox10 rejected : Features node081 rejected : Features node082 rejected : Features node083 rejected : Features node084 rejected : Features node085 rejected : Features node086 rejected : Features node087 rejected : Features node088 rejected : Features node089 rejected : Features node090 rejected : Features node091 rejected : Features node092 rejected : Features node093 rejected : Features node094 rejected : Features node095 rejected : Features node096 rejected : Features job cannot run in partition SMP (insufficient idle procs available: 0 < 24) </pre> How do you interpret all this output? First of all, if your output does not look anything like this, for example if you see an error message about not being able to contact a server, then the scheduler is not running or there is some other problem: see [[Known problems]]. If you need help in this situation, contact one of the [[administrators]]. Assuming your output looks like the above, first of all you should check that the details of your job agree with what you think you submitted. Check <tt>NodeCount</tt> and <tt>TasksPerNode</tt> and the <tt>WallTime</tt> request. You may also want to check the output of <tt>qstat -f <jobid></tt>. Now go down to the reason why <tt>job cannot run in partition DEFAULT</tt>. Normally, this will be as above: this is telling you that there are not enough available CPUs for your job. But suppose you think the cluster is close to empty. Why is this? Look down the list of individual nodes to see why each one is rejecting your job. The example above shows the common reasons: * Features: this means that you have asked for a feature that the node in question doesn't have. For example, jobs submitted to the main cluster will never run on the [[sandbox]] machines. It is normal for many nodes to reject your job for this reason. * State: usually this means that the node in question is marked busy by the system, which means that it is running as many jobs as possible. In this example, there are a number of busy nodes. It can also mean that nodes are down, i.e. that there is a real problem. If many nodes reject your job because of 'State' but you think they cannot be busy, check <tt>pbsnodes -l</tt> to see if they are 'down' and report a problem if so. Nodes that are 'offline' in <tt>pbsnodes -l</tt> have been taken offline by the administrators for maintenance and there is no need to report them unless you think this is an error. 
* CPU: the node is not busy, but it has less available CPU than you have requested. This is most likely because it is running one or more jobs. In the example above, the user has asked for 3 nodes with 8 CPUs each: the nodes rejecting the job because of CPU do not have 8 free CPUs. (If all nodes are rejecting your job for this reason, you should check whether you are asking for an impossible configuration. Jobs submitted to the main cluster asking for more than 16 CPUs will sit there forever waiting for a machine with a suitable number of cores to become available. Does your job really need 3 machines with 8 CPUs, or would it work with 24 CPUs distributed over the cluster?) * ReserveTime: your job asks to run on the node at a time when it is reserved for another purpose. This could be due to another job ahead of you in the queue, or it could be a system reservation. If you see this message but there are no jobs ahead of you in the queue, check your e-mail for a recent e-mail from the administrators regarding scheduled downtime. If you are seeing all of these and you can convince yourself that the individual nodes are rejecting your jobs for valid reasons, then the only thing you can do is wait for the conditions that are blocking your job to clear. Only if you can't convince yourself of this should you contact the administrators. 5d2c4c3d28374bf9008d49c274c6b0b656c60aad Star-CCM+ 0 50 354 2013-10-22T10:40:10Z Jonnya 9 Created page with "Star-CCM+ is an engineering package which can be used to solve CDF problems. This guide (kindly written by Vitaly Voloshin) shows you how to use Star-CCM+ on the cluster." wikitext text/x-wiki Star-CCM+ is an engineering package which can be used to solve CDF problems. This guide (kindly written by Vitaly Voloshin) shows you how to use Star-CCM+ on the cluster. 3c97108c4214cce0c5d9bf0af2b2d3d885418309 355 354 2013-10-22T11:05:38Z Jonnya 9 wikitext text/x-wiki Star-CCM+ is an engineering package which can be used to solve CDF problems. This [http://{{SERVERNAME}}/docs/starccm.pdf guide ] (kindly written by Vitaly Voloshin) shows you how to use Star-CCM+ on the cluster. efcf9d3d81baae209b4168d4b08dd1b99326777f 356 355 2013-10-22T11:15:36Z Jonnya 9 wikitext text/x-wiki Star-CCM+ is an engineering package which can be used to solve CDF problems. This [http://{{SERVERNAME}}/docs/starccm.pdf guide] (kindly written by Vitaly Voloshin) shows you how to use Star-CCM+ on the cluster.
The following files are those listed in the guide: *[http://{{SERVERNAME}}/docs/queue_set.sh queue_set.sh] *[http://{{SERVERNAME}}/docs/starccm_start.sh starccm_start.sh] *[http://{{SERVERNAME}}/docs/run.java run.java] *[http://{{SERVERNAME}}/docs/surf_mesh.java surf_mesh.java] *[http://{{SERVERNAME}}/docs/sv_mesh.java sv_mesh.java] *[http://{{SERVERNAME}}/docs/vol_mesh.java vol_mesh.java] 6cc3aa20d782dad77a172c9e37929da17415175a 359 357 2013-10-22T11:32:26Z Jonnya 9 wikitext text/x-wiki Star-CCM+ is an engineering package which can be used to solve CFD problems. This [http://{{SERVERNAME}}/docs/starccm.pdf guide] (kindly written by Vitaly Voloshin) shows you how to use Star-CCM+ on the cluster. The following files are those listed in the guide: *[http://{{SERVERNAME}}/docs/queue_set.sh queue_set.sh] *[http://{{SERVERNAME}}/docs/starccm_start.sh starccm_start.sh] *[http://{{SERVERNAME}}/docs/run.java run.java] *[http://{{SERVERNAME}}/docs/surf_mesh.java surf_mesh.java] *[http://{{SERVERNAME}}/docs/sv_mesh.java sv_mesh.java] *[http://{{SERVERNAME}}/docs/vol_mesh.java vol_mesh.java] ad61c31b781a2fe2598a551994351e21bbcc3c5c Main Page 0 1 358 316 2013-10-22T11:18:31Z Jonnya 9 wikitext text/x-wiki == Welcome to the cluster documentation wiki == This wiki is the location for documentation for the STRI cluster. If you are a cluster user, feel free to register for an account so that you can describe any method you use to do a particular thing on the cluster. You must be approved for an account before editing the Wiki. Some local documentation on [[MediaWiki]] is available. == Known problems == * [[Known problems]] == Cluster basics == * [[Accounts]] and [[Account cancellation policy]] * [[Policies]] * [[Fair share]] * [[Access]] * [[Architecture]] * [[Networking]] * [[Storage]]; [[quota]] system * [[Administrators]]' contact details == Using the cluster == * [[Jobs]] * [[Queues]] * [[Reservations]] * [[SMP machines]] * [[Sandbox]] * [[Tesla]] * [[Modules]] * [[MPI]] * [[OpenMP]] * [[Parallelization|How to parallelize your job]] * [[Compilers]] * [[Software]] * [[Mail]] * [[Web server]] == Troubleshooting == * [[Why doesn't my job run?]] == Publications == * [[Acknowledgements]] * [[Cluster bibliography]] == How-Tos == * [[Star-CCM+|How to use Star-CCM+ on the clsuter]] 526ec5d92b741db3bf73fcc4e95fa587ac950fd6 370 358 2014-02-11T21:31:42Z Mjh 2 /* Using the cluster */ wikitext text/x-wiki == Welcome to the cluster documentation wiki == This wiki is the location for documentation for the STRI cluster. If you are a cluster user, feel free to register for an account so that you can describe any method you use to do a particular thing on the cluster. You must be approved for an account before editing the Wiki. Some local documentation on [[MediaWiki]] is available. 
== Known problems == * [[Known problems]] == Cluster basics == * [[Accounts]] and [[Account cancellation policy]] * [[Policies]] * [[Fair share]] * [[Access]] * [[Architecture]] * [[Networking]] * [[Storage]]; [[quota]] system * [[Administrators]]' contact details == Using the cluster == * [[Jobs]] * [[Queues]] * [[Reservations]] * [[SMP machines]] * [[Tesla]] * [[Modules]] * [[MPI]] * [[OpenMP]] * [[Parallelization|How to parallelize your job]] * [[Compilers]] * [[Software]] * [[Mail]] * [[Web server]] == Troubleshooting == * [[Why doesn't my job run?]] == Publications == * [[Acknowledgements]] * [[Cluster bibliography]] == How-Tos == * [[Star-CCM+|How to use Star-CCM+ on the cluster]] d7372ad6f1135bac84a617fbc3c94e4f892f4c9a User:Dbab 2 51 362 2013-11-26T13:24:26Z Mjh 2 Creating user page with biography of new user. wikitext text/x-wiki da39a3ee5e6b4b0d3255bfef95601890afd80709 User talk:Dbab 3 52 363 2013-11-26T13:24:27Z Mjh 2 Welcome! wikitext text/x-wiki '''Welcome to ''Clusterwiki''!''' We hope you will contribute much and well. You will probably want to read the [[Help:Contents|help pages]]. Again, welcome and have fun! [[User:Mjh|Mjh]] 13:24, 26 November 2013 (GMT) 9a9fb75e32ecf12a9a75fd8a7dfb57058386fefc Neuron 0 53 365 2013-11-26T17:55:13Z Dbab 11 Created page with "neuron is installed in /soft/nrn to run neuron you should have the library path on path. To do so run <pre> setenv LD_LIBRARY_PATH ${LD_LIBRARY_PATH}:/soft/lib </pre> To make t..." wikitext text/x-wiki Neuron is installed in /soft/nrn. To run Neuron you need the library path set. To do so run <pre> setenv LD_LIBRARY_PATH ${LD_LIBRARY_PATH}:/soft/lib </pre> To make this change permanent for new connections run <pre> echo 'setenv LD_LIBRARY_PATH ${LD_LIBRARY_PATH}:/soft/lib'>>~/.tcshrc </pre> After that you can run <pre> /soft/nrn/x86_64/bin/nrniv /soft/nrn/x86_64/bin/nrngui etc. </pre> 2deab1369cd333677933b42a9dfd51196db08850 366 365 2013-11-26T17:55:46Z Dbab 11 wikitext text/x-wiki Neuron is installed in /soft/nrn. To run Neuron you need the library path set. To do so run <pre> setenv LD_LIBRARY_PATH ${LD_LIBRARY_PATH}:/soft/lib </pre> To make this change permanent for new connections run <pre> echo 'setenv LD_LIBRARY_PATH ${LD_LIBRARY_PATH}:/soft/lib'>>~/.tcshrc </pre> Now you can run Neuron using <pre> /soft/nrn/x86_64/bin/nrniv /soft/nrn/x86_64/bin/nrngui etc. </pre> de558cc8011e5817cbfacfd4fabca572876b745c 367 366 2013-11-26T17:58:55Z Dbab 11 wikitext text/x-wiki Neuron is installed in /soft/nrn. To run Neuron you need the library path set. To do so run <pre> setenv LD_LIBRARY_PATH ${LD_LIBRARY_PATH}:/soft/lib </pre> To make this change permanent for new connections run <pre> echo 'setenv LD_LIBRARY_PATH ${LD_LIBRARY_PATH}:/soft/lib'>>~/.tcshrc </pre> Now you can run Neuron using <pre> /soft/nrn/x86_64/bin/nrniv /soft/nrn/x86_64/bin/nrngui etc. </pre> To run experiments you need to run them through [[Jobs]] though. b5f308ff5bef18993bbbbbad238a467c414b2b2b 368 367 2013-11-26T17:59:58Z Dbab 11 wikitext text/x-wiki Neuron is installed in /soft/nrn. To run Neuron you need the library path set. To do so run <pre> setenv LD_LIBRARY_PATH ${LD_LIBRARY_PATH}:/soft/lib </pre> To make this change permanent for new connections run <pre> echo 'setenv LD_LIBRARY_PATH ${LD_LIBRARY_PATH}:/soft/lib'>>~/.tcshrc </pre> Now you can run Neuron using <pre> /soft/nrn/x86_64/bin/nrniv /soft/nrn/x86_64/bin/nrngui etc. </pre> But don't run experiments directly. To do so you need to use [[Jobs]].
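As a sketch of what such a job might look like, the script below simply combines the library path setting described above with the standard [[Jobs]] submission mechanism. It is a minimal illustration only: the job name, walltime and the input file <tt>mymodel.hoc</tt> are placeholders that you will need to replace with your own values.
<pre>
#!/bin/sh
#PBS -N neuron-test
#PBS -l nodes=1:ppn=1
#PBS -l walltime=01:00:00
#PBS -j oe

# Make the NEURON libraries visible to the batch job, as described above.
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/soft/lib

# The batch system does not preserve your working directory.
cd $PBS_O_WORKDIR

# Run a hoc file non-interactively (mymodel.hoc is a placeholder).
/soft/nrn/x86_64/bin/nrniv mymodel.hoc
</pre>
Submit it with <tt>qsub</tt> in the usual way; see [[Jobs]] for the available options.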
c8d1dbf6a8e87d422ca98171b6bc370c2db55570 369 368 2013-11-26T18:00:31Z Dbab 11 wikitext text/x-wiki neuron is installed in /soft/nrn to run neuron you should have the library path on path. Set it by using <pre> setenv LD_LIBRARY_PATH ${LD_LIBRARY_PATH}:/soft/lib </pre> To make the change permanent for new connections run <pre> echo 'setenv LD_LIBRARY_PATH ${LD_LIBRARY_PATH}:/soft/lib'>>~/.tcshrc </pre> Now you can run neuron using <pre> /soft/nrn/x86_64/bin/nrniv /soft/nrn/x86_64/bin/nrngui etc. </pre> But don't run experiments directly. To do so you need to use [[Jobs]]. f75e0340a31458f66b545c99ba56c7b41ab91a05 Vina 0 23 372 129 2014-02-20T15:09:45Z Akukol 3 wikitext text/x-wiki [http://vina.scripps.edu/ AutoDock Vina] is software for molecular docking. It is very easy to use for virtual screening with shell scripts as explained in the manual. The molecules need to be in pdbqt format. This format can be easily generated with Raccoon (see [[Autodock]]). Copy the pdbqt files, the protein-receptor and the vina configuration file (conf.txt in the example below) into the working directory. The following bash-script carries out the virtual screening. Start with 'nohup vinaScreen.bash > nohup.out &' (not qsub). ''Andreas'' <pre>#!/bin/bash # start from the folder where the .pdbqt files are located mypath=`pwd` for f in ZINC*.pdbqt; do b="dock_$f" echo Processing ligand $b mkdir -p $b echo "#!/bin/bash" > job.sh echo "cd $mypath" >> job.sh echo /soft/autodock_vina_1_1_1_linux_x86/bin/vina --config $mypath/conf.txt --ligand $mypath/$f --out $mypath/${b}/out.pdbqt --log $mypath/${b}/log.txt >> job.sh qsub -N $f -j oe -l cput=00:12:00 -l nodes=1:ppn=8 -l walltime=00:12:00 -l mem=512mb job.sh sleep 30 # reduce this waiting time in seconds to speed up done </pre> 38bb375b527357c5f1565e83c1f1ccc42338636e Autodock 0 22 373 149 2014-02-20T15:10:25Z Akukol 3 wikitext text/x-wiki [http://autodock.scripps.edu/ AutoDock] is software for molecular docking. It is best used together with the graphical user interface [http://mgltools.scripps.edu/downloads AutoDock Tools]. AutoDock Tools can be installed in your home directory e.g. 'akukol/bin/MGLtools'. You need to download the file 'mgltools_x86_64Linux2_1.5.4.tar.gz'. The automatic installer does not work. For virtual screening obtain the software [http://autodock.scripps.edu/resources/raccoon Raccoon]. This needs AutoDock tools to be installed first. Raccoon automatically generates the file vs_submit.sh. This must be edited according to your special circumstances (see below) and then started with: 'nohup vs_submit.sh &' (do not use qsub) ''Andreas'' <pre>#!/bin/bash # # Generated with Raccoon | AutoDockVS # #### PBS jobs parametersCPUT="00:20:00" WALLT="00:20:00" # << change here # # There should be no reason # for changing the following values NODES=1 PPN=1 MEM=512mb ### CUSTOM VARIABLES # # use the following line to set special options (e.g. specific queues) #OPT="-q MyPriorQueue" OPT="-j oe -N AutoDock" # join output and error, job name: Autodock # Paths for executables on the cluster # Modify them to specify custom executables to be used QSUB="qsub" # << change here AUTODOCK="/soft/autodock/autodock4" # << change here # Special path to move into before running # the screening. 
This is very system-specific, # so unless you know what you are doing, # leave it as it is WORKING_PATH=`pwd` ################################################################################## ################################################################################## ####### There should be no need to modify anything below this line ############################### ################################################################################## ################################################################################## # # type $AUTODOCK &> /dev/null || { echo -e "\nError: the file [$AUTODOCK] doesn't exist or is not executable\n"; echo -e "Try to specify the full path to the executable of the AutoDock binary in the script"; echo -e "( i.e. AUTODOCK=/usr/bin/autodock4 )\n\n"; echo -e " [ virtual screening submission aborted]\n" exit 1; } type $QSUB &> /dev/null || { echo -e "\nError: the file [$QSUB] doesn't exist or is not executable\n"; echo -e "Try to specify the full path to the executable of the Qsub command binary in the script"; echo -e "( i.e. QSUB=/usr/bin/qsub )\n\n"; echo -e " [ virtual screening submission aborted]\n" exit 1; } echo Starting submission... for NAME in `cat jobs_list` do cd $NAME echo "#!/bin/bash" > $NAME.job echo "cd $WORKING_PATH/$NAME" >> $NAME.job echo "$AUTODOCK -p $NAME.dpf -l $NAME.dlg" >> $NAME.job chmod +x $NAME.job echo -n "Submitting $NAME : " $QSUB $OPT -l cput=$CPUT -l nodes=1:ppn=1 -l walltime=$WALLT -l mem=$MEM $NAME.job sleep 23 # << add this line to avoid flooding the cluster with thousands of jobs cd .. done </pre> The wait time of 23 seconds may be reduced in order to speed up the calculation. 4c97eaf011bc960cc4dd4a4a4e73907f52f7814a IGemDock 0 21 374 133 2014-02-20T15:11:21Z Akukol 3 wikitext text/x-wiki IGemDock is a user interface to [http://gemdock.life.nctu.edu.tw/dock/ Gemdock] for molecular docking. First you need to execute 'export LD_LIBRARY_PATH=/soft/iGEMDOCKv2.1-centos/bin:$LD_LIBRARY_PATH' (put it in your .cshrc, so it is set automatically when you log in) Start with '/soft/iGEMDOCKv2.1-centos/bin/iGemdock' The molecular docking engine is '/soft/iGEMDOCKv2.1-centos/bin/mod_ga' Gemdock runs one process only (on one CPU core). This is the script RunGemdock.sh that you need (remember to make RunGemdock.sh executable): <pre>#!/bin/sh #PBS -N GemD_comt2 #PBS -q main #PBS -l nodes=1:ppn=1 #PBS -j oe #PBS -u akukol #PBS -l walltime=250:00:00 export LD_LIBRARY_PATH=/soft/iGEMDOCKv2.1-centos/bin:$LD_LIBRARY_PATH cd /home/akukol/data/vscreenTest/comt2_gemdock ### This is the command ### /usr/local/bin/mpiexec /soft/iGEMDOCKv2.1-centos/bin/mod_ga -f docking.dock ### command end ### # start with 'qsub RunGemdock.sh' </pre> ''Andreas'' f2f590a9e6b13a7a31c7b7055804eccc529a596e Gromacs 0 19 375 330 2014-02-20T15:12:12Z Akukol 3 wikitext text/x-wiki [http://www.gromacs.org Gromacs] is software for molecular dynamics simulation. It treats molecules as particles in a classical mechanics forcefield. == '''How to perform a simulation with Gromacs' mdrun:''' == 1) You need to prepare the binary simulation start file (tpr-file) either on your local machine or on the headnode of the cluster. If you prepare it on the local machine, make sure you use the same version of Gromacs. 2) Prepare a shell script (e.g. runjob.sh) as shown in the example below. [[Jobs|More info.]] 3) Make the script executable: chmod +x runjob.sh 4) Submit the job to the cluster: qsub runjob.sh The script below needs to be adjusted according to your needs, i.e.
the user name (#PBS -u), the number of nodes to use (#PBS -l, each node has 8 CPU cores), the maximum time a job should run (#PBS -l walltime=<hours>:<min>:<sec>), the working directory and the details of the mdrun command. Note that the default on the 'main' queue is 24 hours. If you don't specify any [[Queues|walltime]], the job will stop after 24 hours. Look here for [[groperform|optimising performance]]. ''Andreas'' -------------- <pre>#!/bin/sh #PBS -N GromacsTest #PBS -q main #PBS -l nodes=1:ppn=8 #PBS -l walltime=02:00:00 #PBS -j oe #PBS -k oe #PBS -u akukol # runs a job with name 'GromacsTest' on the 'main' cluster # uses 1 node and 8 CPUs (each nodes has 8 CPUs) # set a maximum time of two hours (walltime) # merge 'output' and 'standard error' and output both to 'standard output' (-j oe) # produced the output, while the job is running (-k oe) # specifies user 'akukol' # set required paths: source /soft/gromacs-new/bin/GMXRC # used previously: export LD_LIBRARY_PATH='/usr/lib64/mpich2/lib:$LD_LIBRARY_PATH' export LD_LIBRARY_PATH='/usr/mpi/gcc/mvapich2-1.6/lib:$LD_LIBRARY_PATH' # specify working directory: cd /home/akukol/groTest ### This is the command ### /usr/local/bin/mpiexec mdrun -s md_test.tpr -c after_md.gro -v -stepout 1000 ### command end ### # start with 'qsub runjob.sh' </pre> [http://www.phy.bme.hu/~cluster/docs/PBS.html A good tutorial for using PBS] 32de01bf56a812e6390aca5705d08fadd2f0bb2e Storage 0 8 376 342 2014-02-25T16:13:59Z Mjh 2 wikitext text/x-wiki The '''storage''' available on the cluster is set up as follows: * 1 Tb of user home directories, mounted as /home * 61 Tb of scratch available to all users, mounted as /stri-data * 58 Tb of scratch for CAIR users only, mounted as /cair-scratch * 59 Tb of scratch for CAIR users only, mounted as /cair-data * 167 Tb of scratch for CAR users only, mounted as /car-data * 77 Tb of scratch for CAIR users only, mounted as /cair-work There is also an area for software to be shared over the network, mounted as /soft. These paths should work for both the head node and the compute nodes. The /home disc is for files that are important and require long-term storage (source code, qsub scripts, etc). It should not be used for general data or for large quantities of output from jobs, for example. Large files should be stored on the relevant data disk (if you want to work in /home, use a symbolic link). A [[quota]] system is in place on /home. Please use the scratch space responsibly: when you no longer need it, delete the contents or transfer them elsewhere so that others can use the space. Large datasets will be being processed on the cluster. (See also [[policies]].) The [[SMP machines]] have their own local 10-Tb scratch discs: see the relevant page for more information. Nodes also have a small amount of [[local disk space]] accessible to jobs (but not mounted on the head node), and [[ramdisks]] are also available. No data area on the cluster is currently backed up. You must take responsibility for your own backups. In the longer term we expect to be able to make regular backups of /home and possibly some of /stri-data . e31315b8107b917743de187f340235ea3e8e8a08 Ramdisks 0 54 377 2014-02-25T16:16:32Z Mjh 2 Created page with "All nodes have a 16-Gb ramdisk set up by default. The normal expectation is that you will use the central [[storage]] for file operations, including data processing. However, th..." wikitext text/x-wiki All nodes have a 16-Gb ramdisk set up by default. 
The normal expectation is that you will use the central [[storage]] for file operations, including data processing. However, there may be circumstances where it will be useful to do local, fast file operations. If you want to do this, to avoid interfering with other jobs: * You ''must'' reserve the maximum amount of space that your job will use using the <tt>pmem</tt> option to <tt>qsub</tt>; e.g. <pre>qsub -l nodes=1,pmem=10gb</pre> * You must create a directory in /ramdisk in which your job will work as part of your <tt>qsub</tt> script, which will be unique to your job. For example, you might want to do <pre> mkdir /ramdisk/$PBS_JOBID cd /ramdisk/$PBS_JOBID </pre> * You must only work in this directory, and the total filespace you use must not exceed the reserved amount. * When your job is finished it must before exiting clear up the filespace used; no files must be left in <tt>/ramdisk</tt>. Note that /ramdisk is by nature volatile. When a machine is rebooted, the contents of /ramdisk will be irretrievably lost. 825e15d8e3de2ab9e6d4281f9a93af1203dc9abc 378 377 2014-02-25T16:18:52Z Mjh 2 wikitext text/x-wiki All nodes have a 16-Gb ramdisk set up by default. The normal expectation is that you will use the central [[storage]] for file operations, including data processing. However, there may be circumstances where it will be useful to do local, fast file operations. If you want to do this, to avoid interfering with other jobs: * You ''must'' reserve the maximum amount of space that your job will use using the <tt>pmem</tt> option to <tt>qsub</tt>; e.g. <pre>qsub -l nodes=1,pmem=10gb</pre> * You must create a directory in /ramdisk in which your job will work as part of your <tt>qsub</tt> script, which will be unique to your job. For example, you might want to do <pre> mkdir /ramdisk/$PBS_JOBID cd /ramdisk/$PBS_JOBID </pre> * You must only work in this directory, and the total filespace you use must not exceed the reserved amount. * When your job is finished it must before exiting clear up the filespace used; no files must be left in <tt>/ramdisk</tt>. Note that /ramdisk is by nature volatile. When a machine is rebooted, the contents of /ramdisk will be irretrievably lost. If you want larger, non-volatile local storage, see [[local disk space]]. 15035b4308071b478e6d293ac34932775f5f94bc LOFAR 0 47 379 349 2014-05-06T15:55:11Z Mjh 2 wikitext text/x-wiki To run LOFAR software, do <pre> module load lofar source /soft/lofar/lofarinit.csh </pre> You will need a <tt>.casarc</tt> file, something like this should do: <pre> measures.DE200.directory: /soft/casapy/data/ephemerides measures.DE405.directory: /soft/casapy/data/ephemerides measures.line.directory: /soft/casapy/data/ephemerides measures.sources.directory: /soft/casapy/data/ephemerides measures.comet.directory: /soft/casapy/data/ephemerides measures.ierseop97.directory: /soft/casapy/data/geodetic measures.ierspredict.directory: /soft/casapy/data/geodetic measures.tai_utc.directory: /soft/casapy/data/geodetic measures.igrf.directory: /soft/casapy/data/geodetic measures.observatory.directory: /soft/casapy/data/geodetic </pre> The version of the LOFAR software in /soft/lofar is an old, stable one. For more up-to-date versions look for /soft/lofar-date where date is a numeric build date. Then source /soft/lofar-date/lofarinit.csh instead. 
For newer LOFAR versions you may want to make sure that pyrap is on your path, so a full setup might be <pre> module load casa module load lofar source /soft/lofar-060514/lofarinit.csh setenv PYTHONPATH /soft/pyrap:$PYTHONPATH </pre> 1b05138ee613aeafa8107b4ccccbb522f813db09 385 379 2015-01-13T15:44:11Z Mjh 2 wikitext text/x-wiki To run LOFAR software, do <pre> module load lofar source /soft/lofar/lofarinit.csh </pre> You will need a <tt>.casarc</tt> file, something like this should do: <pre> measures.DE200.directory: /soft/casapy/data/ephemerides measures.DE405.directory: /soft/casapy/data/ephemerides measures.line.directory: /soft/casapy/data/ephemerides measures.sources.directory: /soft/casapy/data/ephemerides measures.comet.directory: /soft/casapy/data/ephemerides measures.ierseop97.directory: /soft/casapy/data/geodetic measures.ierspredict.directory: /soft/casapy/data/geodetic measures.tai_utc.directory: /soft/casapy/data/geodetic measures.igrf.directory: /soft/casapy/data/geodetic measures.observatory.directory: /soft/casapy/data/geodetic </pre> The version of the LOFAR software in /soft/lofar is an old, stable one. For more up-to-date versions look for /soft/lofar-date where date is a numeric build date. Then source /soft/lofar-date/lofarinit.csh instead. For newer LOFAR versions you may want to make sure that pyrap is on your path, so a full setup might be <pre> module load casa module load lofar source /soft/lofar-091114/lofarinit.csh setenv PYTHONPATH /soft/pyrap:$PYTHONPATH </pre> 0bf590861d55baf9d47c5caf237e31dc6ababad3 Queues 0 15 380 345 2014-06-06T10:09:31Z Mjh 2 wikitext text/x-wiki There are six possible job queues available for general use on the system: * 'main' is the default queue: this submits to the 56 nodes of the main cluster. The maximum wall time on this queue is 1 week. * 'cmain' submits to CAR or main-cluster nodes. This queue is restricted to CAR users. The maximum wall time on this queue is 1 week. * 'cair_s' submits to a CAIR node dedicated to small (few CPU) jobs. This queue is restricted to CAIR users. * 'cair_l' submits to the 46 CAIR nodes with no maximum wall time. This queue is restricted to CAIR users. * 'smp' submits to the [[SMP machines]]. It is not currently possible to run MPI jobs that span the SMP machines and the main or CAIR clusters. The maximum wall time for this queue is 48 hours. * 'car' submits to the 32 dedicated CAR nodes. This queue is restricted to CAR users. The maximum wall time for this queue is 1 week. == Default wall times == The default wall time limitation for the 'cair_s' queue is also the maximum, i.e. 6 hours. The default wall time for all the other queues is 24 hours. If you want a job to run for longer, you must explicitly provide a wall time estimate. == Other defaults == The default memory use per process for the main queue is 1 Gb. If you need more memory than this, see the section on [[memory]]. The default number of nodes for a job on all queues is 1. 746d0c749b62ca2cf79cce01c7268d190083699b Software 0 17 381 364 2014-07-01T10:03:36Z Mjh 2 wikitext text/x-wiki This page documents locations of software. Detailed local documentation of software should go in a page specific to that software. 
* <u>[[Gromacs]]</u>: 4.5.5 installed in <tt>/soft/gromacs-new</tt> * Autodock <u>[[Vina]]</u>: 1.1.1 installed in <tt>/soft/autodock_vina_1_1_1_linux_x86</tt> * <u>[[Autodock]]</u>: 4.2 installed in <tt>/soft/autodock</tt> * <u>[[iGemDock]]</u>: 2.1 installed in <tt>/soft/iGEMDOCKv2.1-centos/</tt> * <u>[[AIPS]]</u>: 31DEC10 installed in <tt>/soft/aips</tt> * <u>[[CASA]]</u>: installed in <tt>/soft/casapy...</tt> * <u>[[IDL]]</u>: in <tt>/soft/idl/idl/bin</tt> * <u>[[Matlab]]</u>: in <tt>/soft/matlab/R2010a/bin</tt> * <u>[[Starlink]]</u>: in <tt>/soft/star</tt> or <tt>/soft/stardev</tt> * <u>[[LOFAR]]</u>: in <tt>/soft/lofar</tt> * <u>[[Python packages]]</u>: set your PYTHONPATH to include <tt>/soft/python/lib64/python2.7/site-packages</tt> * <u>[[neuron]]</u>: in <tt> /soft/nrn</tt> * <u>[[Miriad]]</u>: in <tt> /soft/miriad</tt> 6444bf7d0e3ddcdbdd1cee4e891b0d5d20d075d1 405 381 2015-11-26T10:47:38Z Mjh 2 wikitext text/x-wiki This page documents locations of software. Detailed local documentation of software should go in a page specific to that software. * <u>[[Gromacs]]</u>: 4.5.5 installed in <tt>/soft/gromacs-new</tt> * Autodock <u>[[Vina]]</u>: 1.1.1 installed in <tt>/soft/autodock_vina_1_1_1_linux_x86</tt> * <u>[[Autodock]]</u>: 4.2 installed in <tt>/soft/autodock</tt> * <u>[[iGemDock]]</u>: 2.1 installed in <tt>/soft/iGEMDOCKv2.1-centos/</tt> * <u>[[AIPS]]</u>: 31DEC10 installed in <tt>/soft/aips</tt> * <u>[[CASA]]</u>: installed in <tt>/soft/casapy...</tt> * <u>[[IDL]]</u>: in <tt>/soft/idl/idl/bin</tt> * <u>[[Matlab]]</u>: in <tt>/soft/MATLAB/R2013b/bin/Matlab</tt> * <u>[[Starlink]]</u>: in <tt>/soft/star</tt> or <tt>/soft/stardev</tt> * <u>[[LOFAR]]</u>: in <tt>/soft/lofar</tt> * <u>[[Python packages]]</u>: set your PYTHONPATH to include <tt>/soft/python/lib64/python2.7/site-packages</tt> * <u>[[neuron]]</u>: in <tt> /soft/nrn</tt> * <u>[[Miriad]]</u>: in <tt> /soft/miriad</tt> 9bedebc5b7e7d63c95cfbceb58eea05ea311efaf Miriad 0 55 382 2014-07-01T10:04:06Z Mjh 2 Created page with "To access the ATNF Miriad software do <tt>module load miriad</tt>." wikitext text/x-wiki To access the ATNF Miriad software do <tt>module load miriad</tt>. d64933ad085e4c96a7a521b3120f9ee1bcb76546 Jobs 0 9 383 340 2014-07-10T14:57:43Z Jonnya 9 wikitext text/x-wiki The main mechanism for carrying out computing on the cluster is the '''batch queue system'''. This is based around [http://docs.adaptivecomputing.com/torque/help.htm Torque] (formerly known as PBS) and [http://docs.adaptivecomputing.com/maui/index.php Maui], which are free software and widely used in the HPC world, particularly in academia. Torque is the queueing system, which handles job submission, and Maui is the scheduler, which decides what jobs are run. Rather than running processes directly on the compute nodes, you submit a script which, in general, contains both directives to the queuing system and the commands needed to run your jobs. The scheduler will then decide, based on the available resources, whether your job can be run immediately or must be deferred till later. If other people's jobs are running, you are likely to have to wait. It is important therefore to request only the minimum resources you need: a job that only needs 8 nodes may be able to run immediately while a job that requests all 48 nodes in the main cluster may need to wait for days. Once your job starts running the standard output and standard error can be saved or discarded, at your option. 
Most jobs will operate on files in one of the cluster-wide file systems (see [[storage]]). You can also arrange to have the system send you e-mail when your job starts and/or ends. You should enable [[passwordless ssh]] between nodes before starting to use the job submission system. Various parts of it and code that depends on it rely on being able to use ssh or scp. == Basic commands == The command to submit a job to the batch queue system is <tt>qsub</tt>. <tt>qsub</tt> takes one obligatory argument, the name of the script that will be run to start the job: <pre> qsub myjob.sh </pre> This must be a script: specifying a binary file will not work. Optionally, qsub can take a number of arguments that specify what resources the job requires and how it is to be handled by the queuing system. Alternatively, these can be specified in the script, tagged with lines beginning <tt>#PBS</tt> before any executable lines. For example, <pre> cat << END > myjob1.sh #!/bin/sh echo "Hello world" END qsub -N hello -m abe myjob1.sh </pre> and <pre> cat <<END > myjob2.sh #!/bin/sh #PBS -N hello #PBS -m abe echo "Hello world" END qsub myjob2.sh </pre> are functionally equivalent. It is often more useful to have the options specified in the script. You can also have a mixture: if the same option is specified on the <tt>qsub</tt> command line and in a script directive, the command line takes precedence. A full list of qsub options is given by <tt>man qsub</tt> or is available [http://docs.adaptivecomputing.com/torque/2-5-12/help.htm#topics/commands/qsub.htm online]. Some important ones are as follows: * -N: specify the name of the job as it will appear in the queue * -m: say when the system should e-mail you: <tt>-m abe</tt> means to e-mail on abort, begin and end. Before setting this flag, read the page on [[Mail]] and take appropriate action! * -k: specify whether the standard output or standard error streams of the job should be mirrored in your home directory. If you specify this, you will be able to see the output as it's generated, but it will appear in your home directory. If you don't, the output will be stored in the directory specified by -o and -e or in the current working directory of qsub if they are not specified, but it will only appear once the job is finished. * -j: specify whether the output and error streams should be kept separate or merged. * -o, -e: specify where standard output/error should be stored, if not in the current directory * -q: specify what [[queues|queue]] the job will be run on. * -t: start multiple jobs simultaneously (see below) * -W: for job inter-dependencies (see below) * -v: to set environment variables (see below) The most important option is <tt>-l</tt>, which specifies the resources to be used by the job. On our system the useful resources to specify are * <tt>walltime</tt>: the expected execution time in h:m:s format (if unspecified, the default walltime on all queues is 24 hours). * <tt>nodes</tt>: a list of requirements on the nodes to be allocated, or possibly more than one joined by a plus sign. * <tt>pmem</tt>: the physical [[memory]] requirements for your job. * <tt>file</tt>: the [[local disk space]] to be used by your job. 
For example, <pre> qsub -l walltime=2:0:0 -l nodes=40 job.sh ## run for two hours on 40 CPUs (which may be spread over up to 40 physical nodes) qsub -l walltime=1:0:0 -l nodes=16:ppn=8 job.sh ## run for one hour on 16 nodes with 8 CPUs per node qsub -l walltime=0:1:0 -l nodes=2:chassis1:ppn=4 job2.sh ## run for one minute on 2 nodes from chassis1 with 4 CPUs per node qsub -l walltime=0:5:0 -l nodes=1:chassis1:ppn=1+1:chassis2:ppn=1 job3.sh ## run for 5 minutes on one node from chassis1 and one from chassis2 qsub -l walltime=0:0:10 -l nodes=node001:ppn=8 job4.sh ## run for 10 seconds on all CPUs of node001 qsub -l walltime=0:1:0 -l nodes=1:2 -l pmem=8gb job5.sh ## Run two processes for one minute, requesting 8 Gb of memory each </pre> (In general you will not want to specify node or [[architecture|chassis]] names directly, as this will interfere with the efficiency of the scheduling. The exception might be if you want to guarantee that all inter-process communications take place within a chassis; see [[networking]]). Note also that in the present configuration requests for fewer processors per node than the 8 that are available may be consolidated onto fewer nodes. So the third request above, run on an empty cluster, would actually run on one physical node with 8 CPUs. This should never be a problem unless you are specifically testing inter-node communication. For efficiency, you are recommended always to ask for and use 8 processes per node for MPI or multi-threaded code. If you are running single-threaded code, you should ask for one processor per node. The defaults for all the other options are sensible but you ''must'' think carefully about the resources you need. Requesting more time or resources than you need will lead to inefficiency in scheduling your jobs: requesting less than you need (particularly in terms of walltime) is likely to mean that your job does not run to completion. A job that exceeds the walltime estimate will be terminated. 
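The same resource requests can also be written into the script itself as <tt>#PBS</tt> directives, which is often less error-prone than a long command line. A minimal sketch is shown below; the job name, node count, walltime, memory value and the program name <tt>my-mpi-code</tt> are placeholders to adjust for your own job.
<pre>
#!/bin/sh
#PBS -N my-parallel-job
#PBS -l nodes=2:ppn=8
#PBS -l walltime=12:00:00
#PBS -l pmem=2gb
#PBS -j oe

# The batch system does not preserve your working directory.
cd $PBS_O_WORKDIR

# Placeholder command: replace with your own program.
/usr/local/bin/mpiexec my-mpi-code
</pre>
Submit it with <tt>qsub</tt> as usual; remember that anything given on the qsub command line overrides the corresponding directive in the script.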
''A walltime should be specified: the default walltime (see [[Queues]]) is probably not what you want.'' Once a job is submitted, you can view its progress with <tt>qstat</tt>: <pre> >qstat Job id Name User Time Use S Queue ------------------- ---------------- --------------- -------- - ----- 1765.stri-cluster HCG62-doublebeta mjh 00:00:00 R all 1766.stri-cluster ...59-doublebeta mjh 00:00:00 R all 1767.stri-cluster ...07-doublebeta mjh 00:00:00 R all 1768.stri-cluster ...83-doublebeta mjh 00:00:00 R all 1769.stri-cluster ...73-doublebeta mjh 00:00:00 R all 1770.stri-cluster ...61-doublebeta mjh 0 Q all 1771.stri-cluster ...36-doublebeta mjh 0 Q all 1772.stri-cluster ...44-doublebeta mjh 0 Q all 1773.stri-cluster ...29-doublebeta mjh 0 Q all 1774.stri-cluster ...33-doublebeta mjh 0 Q all 1775.stri-cluster ...46-doublebeta mjh 0 Q all >qstat -a stri-cluster.herts.ac.uk: Req'd Req'd Elap Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time -------------------- -------- -------- ---------- ------ ----- --- ------ ----- - ----- 1765.stri-cluster.he mjh all HCG62-doub 11125 16 -- -- 05:00 R 01:56 1766.stri-cluster.he mjh all IC1459-dou 10658 16 -- -- 05:00 R 01:56 1767.stri-cluster.he mjh all NGC3607-do 7355 16 -- -- 05:00 R 01:56 1768.stri-cluster.he mjh all NGC383-dou 12069 16 -- -- 05:00 R 01:56 1769.stri-cluster.he mjh all NGC4073-do 12793 16 -- -- 05:00 R 01:56 1770.stri-cluster.he mjh all NGC4261-do -- 16 -- -- 05:00 Q -- 1771.stri-cluster.he mjh all NGC4636-do -- 16 -- -- 05:00 Q -- 1772.stri-cluster.he mjh all NGC5044-do -- 16 -- -- 05:00 Q -- 1773.stri-cluster.he mjh all NGC5129-do -- 16 -- -- 05:00 Q -- 1774.stri-cluster.he mjh all NGC533-dou -- 16 -- -- 05:00 Q -- 1775.stri-cluster.he mjh all NGC5846-do -- 16 -- -- 05:00 Q -- </pre> The different options to qstat give different views of the queue. Here we see that some jobs are running (R) and some are queued (Q). You may also see jobs that have recently completed (C), jobs that are held pending some condition (H) and, hopefully rarely, jobs where there is some error in the scheduling system (E). The MAUI tool <tt>showq</tt> can also be used to get a view of the whole queue, which is sometimes more user-friendly: <pre> /usr/local/maui/bin/showq ACTIVE JOBS-------------------- JOBNAME USERNAME STATE PROC REMAINING STARTTIME 1765 mjh Running 128 1:15:20 Fri May 7 13:38:20 1766 mjh Running 128 1:15:20 Fri May 7 13:38:20 1767 mjh Running 128 1:15:20 Fri May 7 13:38:20 1768 mjh Running 128 1:15:20 Fri May 7 13:38:20 1769 mjh Running 128 1:15:20 Fri May 7 13:38:20 5 Active Jobs 640 of 640 Processors Active (100.00%) 80 of 80 Nodes Active (100.00%) IDLE JOBS---------------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME 1770 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1771 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1772 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1773 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1774 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1775 mjh Idle 128 5:00:00 Fri May 7 13:38:20 6 Idle Jobs BLOCKED JOBS---------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME Total Jobs: 11 Active Jobs: 5 Idle Jobs: 6 Blocked Jobs: 0 </pre> Finally, you can delete jobs from the queue with the Torque command <tt>qdel</tt>, you can change your resource request with <tt>qualter</tt>, and you can move queued jobs between different queues with <tt>qmove</tt>. == Running code == Your job will start running on ''one'' of the nodes it has been allocated. 
It is your script's responsibility to start your code running on any other nodes you have been allocated. At run time, environment variables contain information about the job. A full list is [http://docs.adaptivecomputing.com/torque/2-5-12/help.htm#topics/2-jobs/exportedBatchEnvVar.htm here]. The crucial one is $PBS_NODEFILE, which provides a list of nodes on which the job should run. ''You must honour this node list.'' The [http://docs.adaptivecomputing.com/torque/2-5-12/help.htm#topics/commands/pbsdsh.htm pbsdsh] command can be used in your script to run a single program on all the allocated nodes, as in the following example: <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ pbsdsh hostname echo ------------------------------------------------------ echo Job ends </pre> pbsdsh would be suitable for starting a number of parallel processes which do not need to communicate, e.g. multiple identical Monte Carlo simulations. For scripts to start [[MPI]] processes in such a way that inter-process communication (IPC) works see the description of [[MPI]]. If you are running [[OpenMP]] code, which should only be running on a single node, then you need to make sure that the code does not grab all available CPUs. An easy way to make sure that you take only the CPUs that you have been allocated would be to put <pre> setenv OMP_NUM_THREADS `cat $PBS_NODEFILE | wc -l` </pre> in the qsub script before the code runs. == Environment == In all cases your script must set the environment that you want your program to run in. Environment variables, aliases and working directory are not inherited from the shell that runs qsub. So you will often want to change to an appropriate working directory and to source some startup files in your qsub script before you actually execute any code. 
<pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ cd /home/fred/my_working_directory export PATH=/home/fred/my_binaries:${PATH} /usr/local/bin/mpiexec my-mpi-code arg1 arg2 echo ------------------------------------------------------ echo Job ends </pre> Note that you can set environment variables in the environment of the running script with the <tt>-v</tt> option to <tt>qsub</tt>: <pre> qsub -v NAME=fred myjob.sh </pre> The value of <tt>$NAME</tt> is then available to the script. This is useful if you want to write a generic script and control it using a parameter, rather than editing the script. == Multiple job submission == The <tt>-t</tt> option to qsub allows you to create multiple jobs simultaneously. These will all run the same script but will differ in the value of the environment variable <tt>$PBS_ARRAYID</tt>. Your script should normally use this variable to decide how to behave differently. For example, multiple runs of a Monte Carlo simulation might use it to store the output in different files. The jobs will all be submitted to the queue immediately and may run concurrently, depending on the resources they request, so it is important to make sure that it is safe to do this. A common mistake is to use the -t option wrongly. <pre> qsub -t 4 myjob.qsub </pre> will only start ''one'' job, with $PBS_ARRAYID set to 4. <pre> qsub -t 1-4 myjob.qsub </pre> will queue four jobs, with $PBS_ARRAYID ranging from 1 to 4. == Interactive jobs == If you need to access nodes interactively, see the separate page on [[interactive jobs]]. == Jobs that depend on other jobs == You may wish to submit jobs that depend on each other: for example, to submit a job that only runs after another job has finished. (This allows you to stack up very long computing requests without violating the maximum walltime constraints.) Look at <tt>man qsub</tt> for the full details, but for a simple dependence use <pre> qsub -W depend=afterany:123456 myjob.qsub </pre> which will cause your job to be run only after job 123456 has finished (no matter what its output status). b528d5cd92ae43036b16ac5d95c96b42d8c0f306 Cluster bibliography 0 30 384 282 2014-09-19T14:27:00Z Akukol 3 wikitext text/x-wiki Please add details of any papers written using the cluster here. This allows us to keep track of the productivity of the cluster. You can use any format you like so long as you identify the authors, the title (this helps us keep track of what different areas of research are being carried out with the cluster), and an indication of the bibliographic details and date. * Kukol A, Hughes DJ, Large-scale analysis of Influenza A virus nucleoprotein sequence conservation reveals potential drug-target sites, '''2014''', ''Virology'', 454/55: 40-47. 
* Poojari, C, Kukol, A, Strodel, B, How the amyloid-β peptide and membranes affect each other: An extensive simulation study, '''2013''', ''Biochimica et Biophysica Acta - Biomembranes'', 1828(2), 327-339. * Steuernagel O., Kakofengitis D., Ritter G., Wigner flow reveals topological order in quantum phase space dynamics, '''2012''', accepted by ''PRL''. * Hardcastle MJ, Krause MGH, Numerical modelling of the lobes of radio galaxies in cluster environments, '''2012''', submitted to ''MNRAS'' * Kalia M, Kukol A, Structure and dynamics of the kinase IKK-β – a key regulator of the NF-kappa B transcription factor, ''J Struct Biol'', '''2011''', 176(2), 133-142. * Kukol A, Consensus virtual screening approaches to predict protein ligands, ''Eur J Med Chem'', '''2011''', 46(9), 4661-4664. * Hardcastle MJ, Croston JH, Modelling TeV gamma-ray emission from the kiloparsec-scale jets of Centaurus A and M87, '''2011''', ''MNRAS'' 415 433 * Goodger JL, Croston JH, Hardcastle MJ, The influence of radio-loud AGN on groups of galaxies: Paper I - the X-ray luminosity-temperature relationship, submitted to ''MNRAS'', May 2011. * de Sousa G, Maex R, Adams R, Davey N, Steuber V, Optimization of neuronal morphologies for pattern recognition, published at ''BMC Neuroscience'' '''2010''' * Karen Safaryan, Reinoud Maex, Rod Adams, Neil Davey, and Volker Steuber, The Effect of Non-Specific LTD on Pattern Recognition in Cerebellar Purkinje Cells, published at ''BMC Neuroscience'' '''11''', P92 42d5895e85e422cdf2c1c458916294419e6176df Memory 0 36 386 2015-01-16T12:38:54Z Mjh 2 wikitext text/x-wiki Jobs that will use a large amount of physical memory must make sure that they request this in the job submission command or script. As described in the section on [[architecture]], the nodes of the main cluster have a range of different physical memory sizes. If the total amount of memory used by all jobs running on a given node is larger than, or even starts to approach, this value, then performance will drop, jobs may be unable to allocate memory and will eventually be killed, and the node may even crash (as the Linux out-of-memory killer appears often to leave a node in an unstable state). To make sure that this doesn't happen to your job (or, worse, your job causes it to happen to someone else's) you should specify the amount of physical memory used per process, if it is more than 900 Mb, the default, by using the <tt>pmem</tt> attribute in the job control system. So, for example, if you need 8 Gb of memory per process for 8 processes, an example job submission script would look like this: <pre> #!/bin/sh -f #PBS -N large-job #PBS -m abe #PBS -l nodes=8 #PBS -l walltime=00:01:00 #PBS -l pmem=8gb #PBS -k oe ... job commands go here ... </pre> This job will attempt to run 8 separate processes, but the scheduling system will not (as it would by default) place them all on the same node, since it knows that the processes would require more memory than is available. They will be placed on separate nodes where their memory requirements will not interfere with each other. It's the user's responsibility to figure out a sensible memory request for large jobs. Err on the side of generosity, but bear in mind that asking for too much memory will make the scheduling of both your and other users' jobs inefficient. Please bear in mind that the typical cluster job runs very comfortably in 1 Gb.
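Equivalently, the memory request can be given on the <tt>qsub</tt> command line instead of in the script; for the example above, something like the following (with <tt>large-job.sh</tt> a placeholder for your own script) should behave in the same way:
<pre>
qsub -l nodes=8 -l pmem=8gb -l walltime=00:01:00 large-job.sh
</pre>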
You can see how much physical memory a running job is using by doing <tt>qstat -f <jobid></tt>: the line <tt>resources_used.mem</tt> tells you the total memory use for all processes. Users needing very large amounts of memory should consider using the [[SMP machines]], which have 256 Gb of physical memory. (If doing this, it is still sensible to use the <tt>pmem</tt> job attribute to make sure that you will not interfere with other users.) 2e74eefd25801048924f81dfc831fd60db1e50ed Architecture 0 7 387 371 2015-01-16T12:40:46Z Mjh 2 wikitext text/x-wiki The cluster consists of * a head node, which is an 2 socket x 4-core x 2 Hyperthreads Xeon-based machine with 32 Gb RAM for user logins and development * 140 compute nodes (or just 'nodes'), as follows: ** 48 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 24 Gb RAM and DDR Infiniband form part of the Main cluster (chassis 1, 2 and 3) ** 32 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 12 Gb RAM and QDR Infiniband form part of the CAIR cluster (chassis 4 and 5) ** 12 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 Gb RAM and QDR Infiniband form the rest of the CAIR cluster (chassis 6) ** 16 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 Gb RAM and QDR Infiniband form part of the CAR cluster (chassis 7) ** 16 Xeons (X5675s) 2 socket x 6-core (no hyperthreading) with 24 Gb RAM and QDR Infiniband form the rest of the CAR cluster (chassis 8) ** 16 Xeons (E5-2660s) 2 socket x 8-core with 16, 32 or 64 Gb RAM and FDR Infiniband form the rest of the main cluster and some dedicated CAIR nodes (chassis 9) * Three [[SMP machines]], two with 48 cores (4 sockets x 12 cores, Opteron 6174) and one with 32 (4 sockets x 8 cores, Xeon 4620), all with 256 Gb RAM and QDR Infiniband (smp1-3). * A file server, stri-server, which is a 2 socket x 8 core Xeon machine ** 345 Tb of [[storage]] attached via Fibre Channel to this server. * A cair data processing server - [[cair-cluster]], which is a 2 socket x 8 core Xeon-based machine with 32 Gb RAM ('''CAIR use only'''). ** 77 Tb of [[storage]] attached via Fibre Channel to this server. * A [[Tesla|Tesla S2050]] CUDA unit attached to smp1. * A separate server providing an additional 12 Tb of storage for CAIR use. * Ethernet and infiniband switches to provide connectivity. The compute nodes are Dell blades installed in enclosures (chassis) of 16 blades each: there are 9 chassis in total. Nodes within a given chassis communicate via switches internal to the chassis; external switches connect the chassis together. See [[networking]] for more details. The image below shows the physical layout of the cluster components as at late 2011. The cluster is no longer in this configuration. 
http://stri-cluster.herts.ac.uk/cluster2.jpg e9677324e2134df529402726ae0ab74217f71626 397 387 2015-07-26T08:47:17Z Mjh 2 wikitext text/x-wiki The cluster consists of * a head node, which is an 2 socket x 4-core x 2 Hyperthreads Xeon-based machine with 32 GB RAM for user logins and development * 140 compute nodes (or just 'nodes'), as follows: ** 48 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 24 GB RAM and DDR Infiniband form part of the Main cluster (chassis 1, 2 and 3) ** 32 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 12 GB RAM and QDR Infiniband form part of the CAIR cluster (chassis 4 and 5) ** 12 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form the rest of the CAIR cluster (chassis 6) ** 16 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form part of the CAR cluster (chassis 7) ** 16 Xeons (X5675s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form the rest of the CAR cluster (chassis 8) ** 12 Xeons (E5-2660s) 2 socket x 8-core with 64 GB RAM and FDR Infiniband form the rest of the main cluster (chassis 9) ** 4 Xeons (E5-2660s) 2 socket x 8-core with 16 GB RAM and FDR Infiniband are dedicated AQF (CAIR) nodes (chassis 9) * Three [[SMP machines]], two with 48 cores (4 sockets x 12 cores, Opteron 6174) and one with 32 (4 sockets x 8 cores, Xeon 4620), all with 256 GB RAM and QDR Infiniband (smp1-3). ** 100 TB of [[storage]] attached to the SMP machines * A file server, stri-server, which is a 2 socket x 8 core Xeon machine ** 473 TB of [[storage]] attached via Fibre Channel to this server. * A cair data processing server - [[cair-cluster]], which is a 2 socket x 8 core Xeon-based machine with 32 Gb RAM ('''CAIR use only'''). ** 132 TB of [[storage]] attached via Fibre Channel to this server. * A [[Tesla|Tesla S2050]] CUDA unit attached to smp1. * A separate server -- [[cair-forecast]] providing an additional 80 Tb of storage and processing for CAIR AQF use. * Ethernet and infiniband switches to provide connectivity. The compute nodes are Dell blades installed in enclosures (chassis) of 16 blades each: there are 9 chassis in total. Nodes within a given chassis communicate via switches internal to the chassis; external switches connect the chassis together. See [[networking]] for more details. The image below shows the physical layout of the cluster components as at late 2011. The cluster is no longer in this configuration. 
http://stri-cluster.herts.ac.uk/cluster2.jpg 2897271c01dce4b9527ab3bedd8c4c376cb26dab 398 397 2015-07-26T08:47:52Z Mjh 2 wikitext text/x-wiki The cluster consists of * a head node, which is an 2 socket x 4-core x 2 Hyperthreads Xeon-based machine with 32 GB RAM for user logins and development * 140 compute nodes (or just 'nodes'), as follows: ** 48 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 24 GB RAM and DDR Infiniband form part of the Main cluster (chassis 1, 2 and 3) ** 32 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 12 GB RAM and QDR Infiniband form part of the CAIR cluster (chassis 4 and 5) ** 12 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form the rest of the CAIR cluster (chassis 6) ** 16 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form part of the CAR cluster (chassis 7) ** 16 Xeons (X5675s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form the rest of the CAR cluster (chassis 8) ** 12 Xeons (E5-2660s) 2 socket x 8-core with 64 GB RAM and FDR Infiniband form the rest of the main cluster (chassis 9) ** 4 Xeons (E5-2660s) 2 socket x 8-core with 16 GB RAM and FDR Infiniband are dedicated AQF (CAIR) nodes (chassis 9) * Three [[SMP machines]], two with 48 cores (4 sockets x 12 cores, Opteron 6174) and one with 32 (4 sockets x 8 cores, Xeon 4620), all with 256 GB RAM and QDR Infiniband (smp1-3). ** 100 TB of [[storage]] attached to the SMP machines * A file server, stri-server, which is a 2 socket x 8 core Xeon machine ** 473 TB of [[storage]] attached via Fibre Channel to this server. * A cair data processing server - [[cair-cluster]], which is a 2 socket x 8 core Xeon-based machine with 32 GB RAM ('''CAIR use only'''). ** 132 TB of [[storage]] attached via Fibre Channel to this server. * A [[Tesla|Tesla S2050]] CUDA unit attached to smp1. * A separate server -- [[cair-forecast]] -- providing an additional 80 TB of storage and processing for CAIR AQF use. * Ethernet and infiniband switches to provide connectivity. The compute nodes are Dell blades installed in enclosures (chassis) of 16 blades each: there are 9 chassis in total. Nodes within a given chassis communicate via switches internal to the chassis; external switches connect the chassis together. See [[networking]] for more details. The image below shows the physical layout of the cluster components as at late 2011. The cluster is no longer in this configuration. http://stri-cluster.herts.ac.uk/cluster2.jpg e0d7964f0d9c8481734da524bae665ebf84b62f8 Todo 0 56 388 2015-04-08T09:56:18Z Jonnya 9 Created page with "Upgrade checklist" wikitext text/x-wiki Upgrade checklist aefc802f499ae3cb2da3a80fe892e9d17b265030 389 388 2015-04-08T10:02:46Z Jonnya 9 wikitext text/x-wiki Upgrade checklist (in no particular order); *Install new CA and generate certs; *Install slapd and import ldif. *setup denyhosts; *setup exim; *setup httpd; *setup munin; *setup ganglia; *setup torque; *setup maui; *setup iptables; *setup routing; *setup nfs shares; *copy /ewtc/fstab entries; abddae616406cf6f863f48f90a9c3222834d84ea 390 389 2015-04-08T10:06:43Z Jonnya 9 wikitext text/x-wiki Upgrade checklist (in no particular order); *Install new CA and generate certs; *Install slapd and import ldif. 
*setup denyhosts; *setup exim; *setup httpd and copy existng web hierarchy; *setup munin; *setup ganglia; *setup torque; *setup maui; *setup iptables; *setup routing; *setup nfs shares; *copy /etc/fstab entries; *copy /etc/rc.d/rc.local (license managers etc) *copy existng system cron jobs (home backup) *copy /root (re: scripts ) 4e6a815cbbde894b158b02b315897d19d62d5265 391 390 2015-04-08T10:08:57Z Jonnya 9 wikitext text/x-wiki Upgrade checklist (in no particular order); *test existing RAID array with cair-cluster; *install new CA and generate certs; *install slapd and import ldif. *setup denyhosts; *setup exim; *setup httpd and copy existng web hierarchy; *setup munin; *setup ganglia; *setup torque; *setup maui; *setup iptables; *setup routing; *setup nfs shares; *copy /etc/fstab entries; *copy /etc/rc.d/rc.local (license managers etc) *copy existng system cron jobs (home backup) *copy /root (re: scripts ) f373cde144710e12e9a61f57c62baf46c69da692 392 391 2015-04-08T10:22:44Z Mjh 2 wikitext text/x-wiki Upgrade checklist (in no particular order); *test existing RAID array with cair-cluster; *install new CA and generate certs; *install slapd and import ldif. *setup denyhosts; *setup exim; *setup httpd and copy existng web hierarchy; *setup wiki *setup munin; *setup ganglia; *setup torque (copy from old including spool dirs for jobs); *setup maui (copy from old); *setup iptables; *setup routing; *setup nfs shares; *setup ntp; *copy /etc/fstab entries; *copy /etc/rc.d/rc.local (license managers etc) *copy existng system cron jobs (home backup) *copy /root (re: scripts ) d56ee06ac14b2e331980255fbe1615713d73f67e 393 392 2015-04-08T16:40:21Z Jonnya 9 wikitext text/x-wiki Upgrade checklist (in no particular order); *test existing RAID array with cair-cluster; *install new CA and generate certs; *install slapd and import ldif. *setup denyhosts; *setup exim; *setup httpd and copy existng web hierarchy; *setup wiki *setup munin; *setup ganglia; *setup torque (copy from old including spool dirs for jobs); *setup maui (copy from old); *setup iptables; *setup routing; *setup nfs shares; *setup ntp; *copy /etc/fstab entries; *copy /etc/rc.d/rc.local (license managers etc) *copy existng system cron jobs (home backup) *copy /root (re: scripts ) *rkhunter/tripwire? 863ef89a5ce0089cfe2d33350a1fc39a0e748c1d 394 393 2015-04-09T08:46:11Z Mjh 2 wikitext text/x-wiki Upgrade checklist (in no particular order); *test existing RAID array with cair-cluster; *install new CA and generate certs; *install slapd and import ldif. *setup denyhosts; *setup dnsmasq *setup exim; *setup httpd and copy existng web hierarchy; *setup wiki *setup munin; *setup ganglia; *setup torque (copy from old including spool dirs for jobs); *setup maui (copy from old); *setup iptables; *setup routing; *setup nfs shares; *setup ntp; *copy /etc/fstab entries; *copy /etc/rc.d/rc.local (license managers etc) *copy existng system cron jobs (home backup) *copy /root (re: scripts ) *rkhunter/tripwire? 5959932ec9318f266687d78399a584331f628931 395 394 2015-04-09T08:48:17Z Mjh 2 wikitext text/x-wiki Upgrade checklist (in no particular order); *test existing RAID array with cair-cluster; *install new CA and generate certs; *install slapd and import ldif. *setup denyhosts; *setup dnsmasq *setup exim; *setup mysql or equivalent *setup httpd and copy existng web hierarchy; *setup wiki (in mysql??) 
*setup munin; *setup ganglia; *setup torque (copy from old including spool dirs for jobs); *setup maui (copy from old); *setup iptables; *setup routing; *setup nfs shares; *setup ntp; *copy /etc/fstab entries; *copy /etc/rc.d/rc.local (license managers etc) *copy existng system cron jobs (home backup) *copy /root (re: scripts ) *rkhunter/tripwire? 3b95cb4c0f19091535d1b860332d06b533c68ce3 Passwordless ssh 0 13 396 71 2015-06-19T07:22:50Z Mjh 2 wikitext text/x-wiki For some applications (including use of the [[jobs|job submission system]]) you will need to enable passwordless ssh between nodes. The simplest way of doing this is as follows: * run <tt>ssh-keygen</tt> and generate a key *with no passphrase* (just press return when prompted). Using a passphrase will not work! * cd into your <tt>~/.ssh</tt> directory. * <tt>cat id_rsa.pub >> authorized_keys</tt> * Passwordless ssh is now set up: try e.g. <tt>ssh node001 hostname</tt> to test it. Note that you are ''not'' permitted to use this to run jobs on the nodes: see [[Policies]] for more. 830c17fd7fd78c695d9139b40a5d31027a5be99f Main Page 0 1 399 370 2015-10-25T11:37:29Z Mjh 2 wikitext text/x-wiki == Welcome to the cluster documentation wiki == This wiki is the location for documentation for the STRI cluster. If you are a cluster user, feel free to register for an account so that you can describe any method you use to do a particular thing on the cluster. You must be approved for an account before editing the Wiki. Some local documentation on [[MediaWiki]] is available. == Known problems == * [[Known problems]] == Cluster basics == * [[Accounts]] and [[Account cancellation policy]] * [[Policies]] * [[Fair share]] * [[Access]] * [[Architecture]] * [[Networking]] * [[Storage]]; [[quota]] system * [[Administrators]]' contact details == Using the cluster == * [[Jobs]] * [[Queues]] * [[Reservations]] * [[SMP machines]] * [[Tesla]] * [[Modules]] * [[MPI]] * [[OpenMP]] * [[Parallelization|How to parallelize your job]] * [[Compilers]] * [[Software]] * [[Mail]] * [[Web server]] * [[LOFAR-UK Compute Facility]] == Troubleshooting == * [[Why doesn't my job run?]] == Publications == * [[Acknowledgements]] * [[Cluster bibliography]] == How-Tos == * [[Star-CCM+|How to use Star-CCM+ on the clsuter]] e0baac66f0ef7637633c5d610a688df91c2ae999 408 399 2015-12-19T09:42:50Z Mjh 2 /* How-Tos */ wikitext text/x-wiki == Welcome to the cluster documentation wiki == This wiki is the location for documentation for the STRI cluster. If you are a cluster user, feel free to register for an account so that you can describe any method you use to do a particular thing on the cluster. You must be approved for an account before editing the Wiki. Some local documentation on [[MediaWiki]] is available. 
== Known problems == * [[Known problems]] == Cluster basics == * [[Accounts]] and [[Account cancellation policy]] * [[Policies]] * [[Fair share]] * [[Access]] * [[Architecture]] * [[Networking]] * [[Storage]]; [[quota]] system * [[Administrators]]' contact details == Using the cluster == * [[Jobs]] * [[Queues]] * [[Reservations]] * [[SMP machines]] * [[Tesla]] * [[Modules]] * [[MPI]] * [[OpenMP]] * [[Parallelization|How to parallelize your job]] * [[Compilers]] * [[Software]] * [[Mail]] * [[Web server]] * [[LOFAR-UK Compute Facility]] == Troubleshooting == * [[Why doesn't my job run?]] == Publications == * [[Acknowledgements]] * [[Cluster bibliography]] == How-Tos == * [[Star-CCM+|How to use Star-CCM+ on the cluster]] 9d179d5efcdc888dd57993fbb9ff2642cc939fe2 409 408 2015-12-22T13:58:23Z Mjh 2 /* Troubleshooting */ wikitext text/x-wiki == Welcome to the cluster documentation wiki == This wiki is the location for documentation for the STRI cluster. If you are a cluster user, feel free to register for an account so that you can describe any method you use to do a particular thing on the cluster. You must be approved for an account before editing the Wiki. Some local documentation on [[MediaWiki]] is available. == Known problems == * [[Known problems]] == Cluster basics == * [[Accounts]] and [[Account cancellation policy]] * [[Policies]] * [[Fair share]] * [[Access]] * [[Architecture]] * [[Networking]] * [[Storage]]; [[quota]] system * [[Administrators]]' contact details == Using the cluster == * [[Jobs]] * [[Queues]] * [[Reservations]] * [[SMP machines]] * [[Tesla]] * [[Modules]] * [[MPI]] * [[OpenMP]] * [[Parallelization|How to parallelize your job]] * [[Compilers]] * [[Software]] * [[Mail]] * [[Web server]] * [[LOFAR-UK Compute Facility]] == Troubleshooting == * [[Why doesn't my job run?]] * [[Job errors]] == Publications == * [[Acknowledgements]] * [[Cluster bibliography]] == How-Tos == * [[Star-CCM+|How to use Star-CCM+ on the cluster]] b7bf3bad45b01ee9553525d98678791590c1f278 LOFAR-UK Compute Facility 0 57 400 2015-10-25T11:53:20Z Mjh 2 Created page with "The LOFAR-UK compute facility is a part of the STRI cluster reserved for LOFAR-UK use. It consists of a dedicated login machine and file server with 128 GB RAM and 16 cores, p..." wikitext text/x-wiki The LOFAR-UK compute facility is a part of the STRI cluster reserved for LOFAR-UK use. It consists of a dedicated login machine and file server with 128 GB RAM and 16 cores, plus some number (up to 12) of compute nodes with 64 GB RAM and 16 cores. LOFAR-UK users can get an account by contacting Martin Hardcastle. This will give them the ability to log in to lofar.herts.ac.uk. Data can be downloaded to the dedicated area /data/lofar/. Data processing should generally be carried out on the compute nodes. When you have a particular project to do, ask for a reservation of enough of the LOFAR nodes to allow you to do what you need to do. When you're finished, let us know so that the reservation can be released. All the standard cluster commands work on lofar.herts.ac.uk. You can do your analysis by running interactive or non-interactive jobs, as appropriate. See the [[LOFAR]] page for infomration on LOFAR software. 0d6f37bc2d9dca59c076e5385feb6669c1311abb 401 400 2015-10-25T12:08:05Z Mjh 2 wikitext text/x-wiki The LOFAR-UK compute facility is a part of the STRI cluster reserved for LOFAR-UK use. 
It consists of a dedicated login machine and file server with 128 GB RAM and 16 cores, plus some number (up to 12) of compute nodes with 64 GB RAM and 16 cores. LOFAR-UK users can get an account by contacting Martin Hardcastle. This will give them the ability to log in to lofar.herts.ac.uk. Data can be downloaded to the dedicated area /data/lofar/. Data processing should generally be carried out on the compute nodes. When you have a particular project to do, ask for a reservation of enough of the LOFAR nodes to allow you to do what you need to do. When you're finished, let us know so that the reservation can be released. (Local users need to submit jobs with the option -W group_list=lofar to make use of the reservation.) All the standard cluster commands work on lofar.herts.ac.uk. You can do your analysis by running interactive or non-interactive jobs, as appropriate. See the [[LOFAR]] page for infomration on LOFAR software. 6f10e01fb6ee603f153a93ec842d09c190c7e120 402 401 2015-10-26T11:59:29Z Mjh 2 wikitext text/x-wiki The LOFAR-UK compute facility is a part of the STRI cluster reserved for LOFAR-UK use. It consists of a dedicated login machine and file server with 128 GB RAM and 16 cores, plus some number (up to 12) of compute nodes with 64 GB RAM and 16 cores. LOFAR-UK users can get an account by contacting Martin Hardcastle. This will give them the ability to log in to <tt>lofar.herts.ac.uk</tt>. Data can be downloaded to the dedicated area <tt>/data/lofar/</tt>. Data processing should generally be carried out on the compute nodes. When you have a particular project to do, ask for a reservation of enough of the LOFAR nodes to allow you to do what you need to do. When you're finished, let us know so that the reservation can be released. (Local users need to submit jobs with the option <tt>-W group_list=lofar</tt> to make use of the reservation.) All the standard cluster commands work on lofar.herts.ac.uk. You can do your analysis by running interactive or non-interactive jobs, as appropriate. See the [[LOFAR]] page for information on LOFAR software. 1b8aa1d49d7ace2bbee907590475fa1ea34b61fe 403 402 2015-10-26T12:00:04Z Mjh 2 wikitext text/x-wiki The LOFAR-UK compute facility is a part of the STRI cluster reserved for LOFAR-UK use. It consists of a dedicated login machine and file server with 128 GB RAM and 16 cores, plus some number (up to 12) of compute nodes with 64 GB RAM and 16 cores. LOFAR-UK users can get an account by contacting Martin Hardcastle. This will give them the ability to log in to <tt>lofar.herts.ac.uk</tt>. Data can be downloaded to the dedicated area <tt>/data/lofar/</tt>. Data processing should generally be carried out on the compute nodes. When you have a particular project to do, ask for a reservation of enough of the LOFAR nodes to allow you to do what you need to do. When you're finished, let us know so that the reservation can be released. (Local, i.e. Herts-based, users need to submit jobs with the option <tt>-W group_list=lofar</tt> to make use of the reservation. External LOFAR-UK users do not.) All the standard cluster commands work on lofar.herts.ac.uk. You can do your analysis by running interactive or non-interactive jobs, as appropriate. See the [[LOFAR]] page for information on LOFAR software. 857c7df7fbe313feb5beb56209501e446923f89f 413 403 2016-01-21T10:56:59Z Mjh 2 wikitext text/x-wiki The LOFAR-UK compute facility is a part of the STRI cluster reserved for LOFAR-UK use. 
It consists of a dedicated login machine and file server with 128 GB RAM and 16 cores, plus some number (up to 12) of compute nodes with 64 GB RAM and 16 cores. LOFAR-UK users can get an account by contacting Martin Hardcastle. This will give them the ability to log in to <tt>lofar.herts.ac.uk</tt>. Data can be downloaded to the dedicated area <tt>/data/lofar/</tt>. Data processing should generally be carried out on the compute nodes. When you have a particular project to do, ask for a reservation of enough of the LOFAR nodes to allow you to do what you need to do. When you're finished, let us know so that the reservation can be released. (Local, i.e. Herts-based, users need to submit jobs with the option <tt>-W group_list=lofar</tt> to make use of the reservation. External LOFAR-UK users do not.) All the standard cluster commands work on lofar.herts.ac.uk. You can do your analysis by running interactive or non-interactive jobs, as appropriate. See the [[LOFAR]] page for information on LOFAR software. A description of the [[Herts LOFAR HBA pipeline]] is available. 7b4715482b89430b800c1bbc83ec677de5561d61 420 413 2016-02-05T12:48:36Z Wwilliams 13 wikitext text/x-wiki The LOFAR-UK compute facility is a part of the STRI cluster reserved for LOFAR-UK use. It consists of a dedicated login machine and file server with 128 GB RAM and 16 cores, plus some number (up to 12) of compute nodes with 64 GB RAM and 16 cores. LOFAR-UK users can get an account by contacting Martin Hardcastle. This will give them the ability to log in to <tt>lofar.herts.ac.uk</tt>. Data can be downloaded to the dedicated area <tt>/data/lofar/</tt>. Data processing should generally be carried out on the compute nodes. When you have a particular project to do, ask for a reservation of enough of the LOFAR nodes to allow you to do what you need to do. When you're finished, let us know so that the reservation can be released. (Local, i.e. Herts-based, users need to submit jobs with the option <tt>-W group_list=lofar</tt> to make use of the reservation. External LOFAR-UK users do not.) All the standard cluster commands work on lofar.herts.ac.uk. You can do your analysis by running interactive or non-interactive jobs, as appropriate. See the [[LOFAR]] page for information on LOFAR software. A description of the [[Herts LOFAR HBA pipeline]] is available. A description of the [[running the generic pipeline]] is available. 0f892725dcf6421d126449460c4385d48f75c0dd 422 420 2016-02-05T17:25:37Z Wwilliams 13 wikitext text/x-wiki The LOFAR-UK compute facility is a part of the STRI cluster reserved for LOFAR-UK use. It consists of a dedicated login machine and file server with 128 GB RAM and 16 cores, plus some number (up to 12) of compute nodes with 64 GB RAM and 16 cores. LOFAR-UK users can get an account by contacting Martin Hardcastle. This will give them the ability to log in to <tt>lofar.herts.ac.uk</tt>. Data can be downloaded to the dedicated area <tt>/data/lofar/</tt>. Data processing should generally be carried out on the compute nodes. When you have a particular project to do, ask for a reservation of enough of the LOFAR nodes to allow you to do what you need to do. When you're finished, let us know so that the reservation can be released. (Local, i.e. Herts-based, users need to submit jobs with the option <tt>-W group_list=lofar</tt> to make use of the reservation. External LOFAR-UK users do not.) All the standard cluster commands work on lofar.herts.ac.uk. 
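As a concrete illustration for Herts-based users, a job inside the LOFAR reservation might be submitted in one of the following ways; the queue, resource request and script name here are examples rather than fixed values (see [[Jobs]] and [[Queues]] for the real options):
<pre>
# batch job inside the LOFAR reservation (myscript.qsub is a placeholder)
qsub -W group_list=lofar -l nodes=1:ppn=16 myscript.qsub

# or an interactive session on one of the reserved nodes
qsub -I -W group_list=lofar -l nodes=1:ppn=16,walltime=12:00:00
</pre>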
You can do your analysis by running interactive or non-interactive jobs, as appropriate. See the [[LOFAR]] page for information on LOFAR software. A description of the [[Herts LOFAR HBA pipeline]] is available. A description of the [[generic pipeline]] is available. 9f7e6390cfe62a310f725409785a2ad77b261459 Administrators 0 6 404 305 2015-11-17T16:57:33Z Mjh 2 /* Administrators */ wikitext text/x-wiki == Administrators == These are currently: * Leigh Smith, l.smith10@herts.ac.uk (x3358, room E117C) * Martin Hardcastle, m.j.hardcastle@herts.ac.uk (x3409, room 2E71 (Innovation Centre)). Contact us with queries. Basic support queries (e.g. account requests, difficulty logging on or using software) should be directed to Leigh in the first instance. 0cd4652683a164ca3367be9b90088adc42c2b9e5 User:Asinha 2 58 406 2015-12-18T13:07:39Z Mjh 2 Creating user page for new user. wikitext text/x-wiki I'm still working on getting my doctorate in Computer Science at the University of Hertfordshire. I work in computational neuroscience and my interests include plasticity - both structural and synaptic, associative memory, recurrent spiking neural networks, bio-mimetic robotics and so on. There are quite a few other topics I muse about but I generally haven't the time to actually research them at the moment. I am currently a PhD candidate at the Biocomputation laboratory at the University of Hertfordshire. I study the capacity of associative memory in networks and the effect that plasticity has on it. 7d8a41077f4f2f8dd9b9f57459196f9d3b7c1ba2 User talk:Asinha 3 59 407 2015-12-18T13:07:39Z Mjh 2 Welcome! wikitext text/x-wiki '''Welcome to ''Clusterwiki''!''' We hope you will contribute much and well. You will probably want to read the [https://www.mediawiki.org/wiki/Special:MyLanguage/Help:Contents help pages]. Again, welcome and have fun! [[User:Mjh|Mjh]] ([[User talk:Mjh|talk]]) 13:07, 18 December 2015 (UTC) 9c73e588bdeab4962b25e0a0b626e002e52517ce Job errors 0 60 410 2015-12-22T14:03:04Z Mjh 2 Created page with "If you have requested that the job control system e-mail you on error, the most common error that you will see looks like this: <pre> Subject: PBS JOB 12345.stri-cluster.hert..." wikitext text/x-wiki If you have requested that the job control system e-mail you on error, the most common error that you will see looks like this: <pre> Subject: PBS JOB 12345.stri-cluster.herts.ac.uk From: adm@stri-cluster.herts.ac.uk To: user@stri-cluster.herts.ac.uk PBS Job Id: 12345.stri-cluster.herts.ac.uk Job Name: test.qsub Exec host: node033/4 An error has occurred processing your job, see below. Post job file processing error; job 12345.stri-cluster.herts.ac.uk on host node033/4 Unable to copy file /var/spool/torque/spool/12345.stri-cluster.herts.ac.uk.OU to user@stri-cluster.herts.ac.uk:/home/user/test.qsub.o12345 *** error from copy Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password). lost connection *** end error output Output retained on that host in: /var/spool/torque/undelivered/12345.stri-cluster.herts.ac.uk.OU </pre> This is a result of not enabling [[passwordless ssh]]. The job control system tries to copy the job output using ssh and fails to do so. Please make sure you have enabled passwordless ssh before you run any jobs. 95eefa08da2a52927bd364b0b78cee01447aece5 Mail 0 18 411 303 2015-12-22T14:05:36Z Mjh 2 wikitext text/x-wiki Various systems on the cluster will want to send you e-mail. By default this will be forwarded to the e-mail address you supplied on account creation. 
If you wish to change where the e-mail goes, you should modify the <tt>.forward</tt> file in your home directory: this is a plain text file containing one or more e-mail addresses. Under no circumstances should you remove the file and allow e-mail to remain on the cluster. fb4aeef43b52ba6747c1aa1d4f9bbd8c567638ba Known problems 0 25 412 352 2015-12-31T09:59:43Z Mjh 2 wikitext text/x-wiki == Known problems == * Although the Infiniband links carry NFS traffic between the nodes and the head node, they are not using RDMA and so their latency is higher and bandwidth lower than it should be. This seems to be a problem with the Linux kernels on the nodes; NFS over RDMA+infiniband is unstable. * The scheduler sometimes crashes for unknown reasons causing jobs not to run. (Regularly run scripts check and restart the scheduler.) * The scheduler very occasionally will not run a job that could be run immediately in free resources. If <tt>checkjob</tt> shows 'job can run' and <tt>showstart</tt> shows an immediate start, but your job does not run, then please contact the [[administrators]]. * Node specifications of the form <tt>nodes=main:ppn=16</tt> or <tt>nodes=smp:ppn=1</tt> will severely confuse the scheduler, although they are valid. Please do not use queue names in node specifications: always do something like <tt>-q main -l nodes=1:ppn=16</tt> instead. d525a00836a05f6c18b72819cde7fa154fb6c504 Herts LOFAR HBA pipeline 0 61 414 2016-01-21T11:18:15Z Mjh 2 Created page with "The Herts HBA pipeline is a set of 'pre-facet' processing steps for HBA data that are adapted to running on the Herts cluster using the [[jobs|job control system]]. All the s..." wikitext text/x-wiki The Herts HBA pipeline is a set of 'pre-facet' processing steps for HBA data that are adapted to running on the Herts cluster using the [[jobs|job control system]]. All the steps in the pipeline process are controlled by a single parameter set in a simple hierarchical format. An example parameter set is shown below. <pre> [paths] unpack=/car-data/mjh/lofar/new processed=/smp3/mjh/lofar/nw-facet work=/local/mjh [files] calibrator=L221264 target=L221266 [calibration] flagintbaselines=True skymodel=/home/mjh/lofar/surveys-pipeline/3C196PANDEYJUL14.skymodel fitcra=True notransfer=True [preflag] sbrange=0,121 antenna=CS103HBA0 [control] dryrun=False beam_applied=False </pre> All the scripts are located in <tt>/home/mjh/lofar/surveys-pipeline</tt>. You should begin by creating your own version of the config file shown here. In what follows <tt>username</tt> represents your login ID on the system. In <tt>[paths]</tt>, <tt>unpack</tt> is the directory containing the tar files for the unprocessed target and calibrator fields: <tt>processed</tt> is the directory where processed data will be stored, normally <tt>/data/lofar/username/...</tt>: <tt>work</tt> should be a local working directory on the nodes, normally <tt>/local/username</tt>. <tt>[files]</tt> should give the prefixes (LOFAR IDs) for target and calibrator. The entries in <tt>[calibration]</tt> are very important. For pre-facet calibration we want <tt>fitcra=True</tt>, <tt>notransfer=True</tt>. If you have a sky model for your calibrator it should be specified in <tt>skymodel</tt>. Any other pre-calibration activity, such as flagging the international baselines, needs to be specified here. 
Valid options include: * <tt>antennafix</tt>: default False, run fixbeaminfo * <tt>antennafix15</tt>: default False, run 2015 fixbeaminfo * <tt>flagbadweight</tt>: default False, flag WEIGHT_SPECTRUM>1, needed for some Cycle 2 data * <tt>rficonsole</tt>: default True, run rficonsole * <tt>skipexisting</tt>: default False, do not run if output data already exist <tt>[preflag]</tt> should initially be left empty. 84a6d6b8f3d4e30de29831ca09923273956fd221 415 414 2016-01-21T14:42:34Z Mjh 2 wikitext text/x-wiki The Herts HBA pipeline is a set of 'pre-facet' processing steps for HBA data that are adapted to running on the Herts cluster using the [[jobs|job control system]]. All the steps in the pipeline process are controlled by a single parameter set in a simple hierarchical format. An example parameter set is shown below. <pre> [paths] unpack=/car-data/mjh/lofar/new processed=/smp3/mjh/lofar/nw-facet work=/local/mjh [files] calibrator=L221264 target=L221266 [calibration] flagintbaselines=True skymodel=/home/mjh/lofar/surveys-pipeline/3C196PANDEYJUL14.skymodel fitcra=True notransfer=True [preflag] sbrange=0,121 antenna=CS103HBA0 [control] dryrun=False </pre> All the scripts are located in <tt>/home/mjh/lofar/surveys-pipeline</tt>. This description assumes that you are logged in to the LOFAR-UK head node. == Step 1: == You should begin by creating your own version of the config file shown here. In what follows <tt>username</tt> represents your login ID on the system. In <tt>[paths]</tt>, <tt>unpack</tt> is the directory containing the tar files for the unprocessed target and calibrator fields: <tt>processed</tt> is the directory where processed data will be stored, normally <tt>/data/lofar/username/...</tt>: <tt>work</tt> should be a local working directory on the nodes, normally <tt>/local/username</tt>. <tt>[files]</tt> should give the prefixes (LOFAR IDs) for target and calibrator. The entries in <tt>[calibration]</tt> are very important. For pre-facet calibration we want <tt>fitcra=True</tt>, <tt>notransfer=True</tt>. If you have a sky model for your calibrator it should be specified in <tt>skymodel</tt> &mdash; this is important for calibrators that are resolved on the long baselines like 3C196 or 3C295. Any other pre-calibration activity, such as flagging the international baselines, needs to be specified here. Valid options include: * <tt>antennafix</tt>: default False, run fixbeaminfo * <tt>antennafix15</tt>: default False, run 2015 fixbeaminfo * <tt>flagbadweight</tt>: default False, flag WEIGHT_SPECTRUM>1, needed for some Cycle 2 data * <tt>rficonsole</tt>: default True, run rficonsole * <tt>skipexisting</tt>: default False, do not run if output data already exist <tt>[preflag]</tt> should initially be left empty. <tt>[control]</tt> is used for general control options -- setting <tt>dryrun=True</tt> will mean that commands to be executed by the scripts are not run but only printed (useful for debugging purposes). == Step 2: == The next step is to calibrate the calibrator. Test the process as follows: <pre> (LOFAR setup) /home/mjh/lofar/surveys-pipeline/calib.py config.cfg subband-number </pre> LOFAR setup is as described on the [[LOFAR]] page; config.cfg should be the full path to your config file; sub-band number should be a reliable sub-band, say 200. If all is well, this will take a few minutes and will create a copy of the calibrator and target data in your <tt>processed</tt> path. 
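Note that this single sub-band test is normally run in an interactive session on a compute node rather than on the head node itself (the instructions below refer to exiting that interactive session). One way of obtaining such a session is sketched here; the resource request is illustrative and the config path is a placeholder:
<pre>
# start an interactive job (options illustrative), then do the LOFAR setup and run the test
qsub -I -q main -W group_list=lofar -l nodes=1:ppn=16
/home/mjh/lofar/surveys-pipeline/calib.py /home/username/prefacet.cfg 200
</pre>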
Feel free to inspect the <tt>CORRECTED_DATA</tt> for the calibrator and the amp/phase solutions in the instrument table. If this single step works, you can proceed to calibrate all the data. Exit the interactive session and do <pre> qsub -t 0-365 -v CONFIG=/full/path/to/config.cfg /home/mjh/lofar/surveys-pipeline/run-calib.qsub </pre> This runs as many jobs as possible in parallel, so initially you will submit 366 separate jobs to the queue. Use <tt>qstat</tt> to check the progress of the jobs as they pass from queued (Q) to running (R) to completed (C). Each individual job takes only a few minutes. Jobs that complete immediately are a sign of problems. Check the output from these jobs, which will accumulate in your home directory. When everything is completed, check that all data have been written to the processed directory as expected. === Step 3: === The next steps set things up for Clock-TEC separation. Make sure the LOFAR scripts are on your path as usual, then <pre> setenv PYTHONPATH /home/mjh/git:/home/mjh/reinout-scripts_v3:$PYTHONPATH /home/mjh/lofar/surveys-pipeline/clocktec-prep.py file.cfg /home/mjh/lofar/surveys-pipeline/amplitudes_losoto.py file.cfg /home/mjh/lofar/surveys-pipeline/find_bad_subband.py cal.h5 </pre> The <tt>amplitudes_losoto</tt> script will generate some matrices of amplitude solutions vs time and baseline in the working directory. <tt>find_bad_subband</tt> searches these for outliers. You should verify the sub-bands you want to exclude by looking at the matrices, then add a <tt>badsblist</tt> line to the calibration section of your config file, e.g. <pre> badsblist=[267, 302, 304, 305, 306, 307, 308, 309, 310, 311] </pre> If you find bad *antennas* at this point &mdash; or antennas that are bad on many baselines &mdash; it is best to put them in <tt>preflag</tt> in the config file and redo the calibration from the end of step 2 (i.e. delete everything in your processed directory and redo the 366-band qsub). Then fit for Clock-TEC: <pre> /home/mjh/lofar/surveys-pipeline/amplitudes_losoto.py file.cfg /home/mjh/lofar/surveys-pipeline/fit_clocktec_initialguess_losoto.py file.cfg /home/mjh/lofar/surveys-pipeline/examine_npys.py file.cfg </pre> Again, look at the plots generated by these scripts. If all looks sensible (core stations have low clock offsets and clock is largely constant per antenna) then continue: <pre> /home/mjh/lofar/surveys-pipeline/find_cal_global_phaseoffset.py file.cfg /home/mjh/lofar/surveys-pipeline/make_template_parmdb.py file.cfg </pre> You may now leave the interactive session and apply the solutions to the target: <pre> qsub -t 0-366 -q main -W group_list=lofar /home/mjh/lofar/surveys-pipeline/apply-clocktec.qsub -v CONFIG=file.cfg </pre> == Step 4: == You now need to combine these individual datasets and prepare for facet calibration. TBD... b8e5d6aae870b9eb404c4ba6c5ccb1393ff2702f 416 415 2016-01-21T14:42:55Z Mjh 2 wikitext text/x-wiki The Herts HBA pipeline is a set of 'pre-facet' processing steps for HBA data that are adapted to running on the Herts cluster using the [[jobs|job control system]]. All the steps in the pipeline process are controlled by a single parameter set in a simple hierarchical format. An example parameter set covering the initial steps is shown below. 
<pre> [paths] unpack=/car-data/mjh/lofar/new processed=/smp3/mjh/lofar/nw-facet work=/local/mjh [files] calibrator=L221264 target=L221266 [calibration] flagintbaselines=True skymodel=/home/mjh/lofar/surveys-pipeline/3C196PANDEYJUL14.skymodel fitcra=True notransfer=True [preflag] sbrange=0,121 antenna=CS103HBA0 [control] dryrun=False </pre> All the scripts are located in <tt>/home/mjh/lofar/surveys-pipeline</tt>. This description assumes that you are logged in to the LOFAR-UK head node. == Step 1: == You should begin by creating your own version of the config file shown here. In what follows <tt>username</tt> represents your login ID on the system. In <tt>[paths]</tt>, <tt>unpack</tt> is the directory containing the tar files for the unprocessed target and calibrator fields: <tt>processed</tt> is the directory where processed data will be stored, normally <tt>/data/lofar/username/...</tt>: <tt>work</tt> should be a local working directory on the nodes, normally <tt>/local/username</tt>. <tt>[files]</tt> should give the prefixes (LOFAR IDs) for target and calibrator. The entries in <tt>[calibration]</tt> are very important. For pre-facet calibration we want <tt>fitcra=True</tt>, <tt>notransfer=True</tt>. If you have a sky model for your calibrator it should be specified in <tt>skymodel</tt> &mdash; this is important for calibrators that are resolved on the long baselines like 3C196 or 3C295. Any other pre-calibration activity, such as flagging the international baselines, needs to be specified here. Valid options include: * <tt>antennafix</tt>: default False, run fixbeaminfo * <tt>antennafix15</tt>: default False, run 2015 fixbeaminfo * <tt>flagbadweight</tt>: default False, flag WEIGHT_SPECTRUM>1, needed for some Cycle 2 data * <tt>rficonsole</tt>: default True, run rficonsole * <tt>skipexisting</tt>: default False, do not run if output data already exist <tt>[preflag]</tt> should initially be left empty. <tt>[control]</tt> is used for general control options -- setting <tt>dryrun=True</tt> will mean that commands to be executed by the scripts are not run but only printed (useful for debugging purposes). == Step 2: == The next step is to calibrate the calibrator. Test the process as follows: <pre> (LOFAR setup) /home/mjh/lofar/surveys-pipeline/calib.py config.cfg subband-number </pre> LOFAR setup is as described on the [[LOFAR]] page; config.cfg should be the full path to your config file; sub-band number should be a reliable sub-band, say 200. If all is well, this will take a few minutes and will create a copy of the calibrator and target data in your <tt>processed</tt> path. Feel free to inspect the <tt>CORRECTED_DATA</tt> for the calibrator and the amp/phase solutions in the instrument table. If this single step works, you can proceed to calibrate all the data. Exit the interactive session and do <pre> qsub -t 0-365 -v CONFIG=/full/path/to/config.cfg /home/mjh/lofar/surveys-pipeline/run-calib.qsub </pre> This runs as many jobs as possible in parallel, so initially you will submit 366 separate jobs to the queue. Use <tt>qstat</tt> to check the progress of the jobs as they pass from queued (Q) to running (R) to completed (C). Each individual job takes only a few minutes. Jobs that complete immediately are a sign of problems. Check the output from these jobs, which will accumulate in your home directory. When everything is completed, check that all data have been written to the processed directory as expected. === Step 3: === The next steps set things up for Clock-TEC separation. 
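The environment commands in the next block are written for csh/tcsh. If your login shell is bash, the equivalent of the <tt>setenv PYTHONPATH</tt> line would be something like the sketch below (assuming the same directories apply):
<pre>
# bash equivalent of the csh "setenv PYTHONPATH ..." line shown in the next block
export PYTHONPATH=/home/mjh/git:/home/mjh/reinout-scripts_v3:$PYTHONPATH
</pre>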
Make sure the LOFAR scripts are on your path as usual, then <pre> setenv PYTHONPATH /home/mjh/git:/home/mjh/reinout-scripts_v3:$PYTHONPATH /home/mjh/lofar/surveys-pipeline/clocktec-prep.py file.cfg /home/mjh/lofar/surveys-pipeline/amplitudes_losoto.py file.cfg /home/mjh/lofar/surveys-pipeline/find_bad_subband.py cal.h5 </pre> The <tt>amplitudes_losoto</tt> script will generate some matrices of amplitude solutions vs time and baseline in the working directory. <tt>find_bad_subband</tt> searches these for outliers. You should verify the sub-bands you want to exclude by looking at the matrices, then add a <tt>badsblist</tt> line to the calibration section of your config file, e.g. <pre> badsblist=[267, 302, 304, 305, 306, 307, 308, 309, 310, 311] </pre> If you find bad *antennas* at this point &mdash; or antennas that are bad on many baselines &mdash; it is best to put them in <tt>preflag</tt> in the config file and redo the calibration from the end of step 2 (i.e. delete everything in your processed directory and redo the 366-band qsub). Then fit for Clock-TEC: <pre> /home/mjh/lofar/surveys-pipeline/amplitudes_losoto.py file.cfg /home/mjh/lofar/surveys-pipeline/fit_clocktec_initialguess_losoto.py file.cfg /home/mjh/lofar/surveys-pipeline/examine_npys.py file.cfg </pre> Again, look at the plots generated by these scripts. If all looks sensible (core stations have low clock offsets and clock is largely constant per antenna) then continue: <pre> /home/mjh/lofar/surveys-pipeline/find_cal_global_phaseoffset.py file.cfg /home/mjh/lofar/surveys-pipeline/make_template_parmdb.py file.cfg </pre> You may now leave the interactive session and apply the solutions to the target: <pre> qsub -t 0-366 -q main -W group_list=lofar /home/mjh/lofar/surveys-pipeline/apply-clocktec.qsub -v CONFIG=file.cfg </pre> == Step 4: == You now need to combine these individual datasets and prepare for facet calibration. TBD... 1e9037dbd6bcfebce6ebd97ce861081887b31028 417 416 2016-01-21T14:43:21Z Mjh 2 wikitext text/x-wiki The Herts HBA pipeline is a set of 'pre-facet' processing steps for HBA data that are adapted to running on the Herts cluster using the [[jobs|job control system]]. All the steps in the pipeline process are controlled by a single parameter set in a simple hierarchical format. An example parameter set covering the initial steps is shown below. <pre> [paths] unpack=/car-data/mjh/lofar/new processed=/smp3/mjh/lofar/nw-facet work=/local/mjh [files] calibrator=L221264 target=L221266 [calibration] flagintbaselines=True skymodel=/home/mjh/lofar/surveys-pipeline/3C196PANDEYJUL14.skymodel fitcra=True notransfer=True [preflag] sbrange=0,121 antenna=CS103HBA0 [control] dryrun=False </pre> All the scripts are located in <tt>/home/mjh/lofar/surveys-pipeline</tt>. This description assumes that you are logged in to the LOFAR-UK head node. == Step 1: == You should begin by creating your own version of the config file shown here. In what follows <tt>username</tt> represents your login ID on the system. In <tt>[paths]</tt>, <tt>unpack</tt> is the directory containing the tar files for the unprocessed target and calibrator fields: <tt>processed</tt> is the directory where processed data will be stored, normally <tt>/data/lofar/username/...</tt>: <tt>work</tt> should be a local working directory on the nodes, normally <tt>/local/username</tt>. <tt>[files]</tt> should give the prefixes (LOFAR IDs) for target and calibrator. The entries in <tt>[calibration]</tt> are very important. 
For pre-facet calibration we want <tt>fitcra=True</tt>, <tt>notransfer=True</tt>. If you have a sky model for your calibrator it should be specified in <tt>skymodel</tt> &mdash; this is important for calibrators that are resolved on the long baselines like 3C196 or 3C295. Any other pre-calibration activity, such as flagging the international baselines, needs to be specified here. Valid options include: * <tt>antennafix</tt>: default False, run fixbeaminfo * <tt>antennafix15</tt>: default False, run 2015 fixbeaminfo * <tt>flagbadweight</tt>: default False, flag WEIGHT_SPECTRUM>1, needed for some Cycle 2 data * <tt>rficonsole</tt>: default True, run rficonsole * <tt>skipexisting</tt>: default False, do not run if output data already exist <tt>[preflag]</tt> should initially be left empty. <tt>[control]</tt> is used for general control options -- setting <tt>dryrun=True</tt> will mean that commands to be executed by the scripts are not run but only printed (useful for debugging purposes). == Step 2: == The next step is to calibrate the calibrator. Test the process as follows: <pre> (LOFAR setup) /home/mjh/lofar/surveys-pipeline/calib.py config.cfg subband-number </pre> LOFAR setup is as described on the [[LOFAR]] page; config.cfg should be the full path to your config file; sub-band number should be a reliable sub-band, say 200. If all is well, this will take a few minutes and will create a copy of the calibrator and target data in your <tt>processed</tt> path. Feel free to inspect the <tt>CORRECTED_DATA</tt> for the calibrator and the amp/phase solutions in the instrument table. If this single step works, you can proceed to calibrate all the data. Exit the interactive session and do <pre> qsub -t 0-365 -v CONFIG=/full/path/to/config.cfg /home/mjh/lofar/surveys-pipeline/run-calib.qsub </pre> This runs as many jobs as possible in parallel, so initially you will submit 366 separate jobs to the queue. Use <tt>qstat</tt> to check the progress of the jobs as they pass from queued (Q) to running (R) to completed (C). Each individual job takes only a few minutes. Jobs that complete immediately are a sign of problems. Check the output from these jobs, which will accumulate in your home directory. When everything is completed, check that all data have been written to the processed directory as expected. == Step 3: == The next steps set things up for Clock-TEC separation. Make sure the LOFAR scripts are on your path as usual, then <pre> setenv PYTHONPATH /home/mjh/git:/home/mjh/reinout-scripts_v3:$PYTHONPATH /home/mjh/lofar/surveys-pipeline/clocktec-prep.py file.cfg /home/mjh/lofar/surveys-pipeline/amplitudes_losoto.py file.cfg /home/mjh/lofar/surveys-pipeline/find_bad_subband.py cal.h5 </pre> The <tt>amplitudes_losoto</tt> script will generate some matrices of amplitude solutions vs time and baseline in the working directory. <tt>find_bad_subband</tt> searches these for outliers. You should verify the sub-bands you want to exclude by looking at the matrices, then add a <tt>badsblist</tt> line to the calibration section of your config file, e.g. <pre> badsblist=[267, 302, 304, 305, 306, 307, 308, 309, 310, 311] </pre> If you find bad *antennas* at this point &mdash; or antennas that are bad on many baselines &mdash; it is best to put them in <tt>preflag</tt> in the config file and redo the calibration from the end of step 2 (i.e. delete everything in your processed directory and redo the 366-band qsub). 
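A minimal sketch of that redo, assuming your processed directory is under <tt>/data/lofar/username</tt> as suggested in Step 1 (the <tt>rm</tt> is destructive and both paths are examples only):
<pre>
# clear the processed data and rerun the per-sub-band calibration from Step 2
rm -rf /data/lofar/username/*
qsub -t 0-365 -v CONFIG=/full/path/to/config.cfg /home/mjh/lofar/surveys-pipeline/run-calib.qsub
</pre>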
Then fit for Clock-TEC: <pre> /home/mjh/lofar/surveys-pipeline/amplitudes_losoto.py file.cfg /home/mjh/lofar/surveys-pipeline/fit_clocktec_initialguess_losoto.py file.cfg /home/mjh/lofar/surveys-pipeline/examine_npys.py file.cfg </pre> Again, look at the plots generated by these scripts. If all looks sensible (core stations have low clock offsets and clock is largely constant per antenna) then continue: <pre> /home/mjh/lofar/surveys-pipeline/find_cal_global_phaseoffset.py file.cfg /home/mjh/lofar/surveys-pipeline/make_template_parmdb.py file.cfg </pre> You may now leave the interactive session and apply the solutions to the target: <pre> qsub -t 0-366 -q main -W group_list=lofar /home/mjh/lofar/surveys-pipeline/apply-clocktec.qsub -v CONFIG=file.cfg </pre> == Step 4: == You now need to combine these individual datasets and prepare for facet calibration. TBD... 841d64ad2a5bbf6f703f32be2e73fa59fbbfa1e9 User:Wwilliams 2 62 418 2016-02-05T12:36:01Z Mjh 2 Creating user page for new user. wikitext text/x-wiki Dr Wendy L. Williams Postdoctoral Research Assistant Centre for Astrophysics Research School of Physics, Astronomy and Mathematics University of Hertfordshire PhD in Astronomy Leiden Observatory Leiden University Research interests: - Multi-wavelength studies of galaxy formation and evolution over cosmic time - Evolution of active galactic nuclei - low-frequency radio calibration and imaging (especially LOFAR) - radio surveys b0afe0930354bb1ad84e786949e055cb49691f62 User talk:Wwilliams 3 63 419 2016-02-05T12:36:02Z Mjh 2 Welcome! wikitext text/x-wiki '''Welcome to ''Clusterwiki''!''' We hope you will contribute much and well. You will probably want to read the [https://www.mediawiki.org/wiki/Special:MyLanguage/Help:Contents help pages]. Again, welcome and have fun! [[User:Mjh|Mjh]] ([[User talk:Mjh|talk]]) 12:36, 5 February 2016 (UTC) d06df0830827e935817c13275e68c8c4ec08d78d Generic pipeline 0 64 421 2016-02-05T17:24:56Z Wwilliams 13 created wikitext text/x-wiki == The generic pipeline == The generic pipeline comes with LOFAR builds 2.13+. At http://www.astron.nl/citt/genericpipeline/#quick-start there is a description of how to setup and run the generic pipeline. At Herts, the pipeline is included in the most recent lofar build, where a few patches have been applied (<tt>$LOFARROOT=/soft/lofar-070915/</tt>). * A patch has been made to allow it to run across multiple whole nodes, in that you can submit a job which will run on N nodes. The [[jobs|job control system]] will allocate the nodes to you and the generic pipeline will deal with distributing the processes to the other nodes. This patch is in <tt>$LOFARROOT/lib64/python2.7/site-packages/lofarpipe/support/remotecommand.py</tt> * A patch has been made to fix a bug with output locations (in particular for calibrator diagnostic plots and npy files). This patch is in <tt>$LOFARROOT/lib64/python2.7/site-packages/lofarpipe/recipes/nodes/python_plugin.py</tt>, but should be fixed in the latest LOFAR builds (2.15+). The pipeline runs sends processes to other available nodes using <tt>mpiexec</tt>, but requires the <tt>openmpi</tt> version instead of what is loaded as standard on the cluster (see also [[MPI|MPI]]). 
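Since the pipeline dispatches its processes to the other nodes over <tt>ssh</tt>, it is worth confirming first that [[Passwordless ssh|passwordless ssh]] between nodes is already working; the quick test from that page is:
<pre>
ssh node001 hostname    # should print the node name without prompting for a password
</pre>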
For all the nodes to understand the <tt>mpiexec</tt> command that the pipeline sends via <tt>ssh</tt>, this should be set in your <tt>.cshrc</tt> file: <pre> module unload mpi/mpich-x86_64 module load mpi/openmpi-x86_64 </pre> The LOFAR software should also be sourced here, e.g.: <pre> alias lofar-newest 'setenv PATH /soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:${PATH} ; module load lofar; source /soft/lofar-070915/lofarinit.csh; setenv PATH /home/mjh/lofar/bin:$PATH; setenv PYTHONPATH /soft/pyrap-new:$PYTHONPATH; setenv LD_LIBRARY_PATH /soft/boost/lib:/soft/casacore-1.7.0/lib:$LD_LIBRARY_PATH' alias lofar-tools 'setenv PYTHONPATH /home/wwilliams/software/lib/python2.7/site-packages/:/home/wwilliams/software/anaconda2/lib/python2.7/site-packages/:/home/wwilliams/python/face-cal-scripts/:$PYTHONPATH' lofar-newest lofar-tools </pre> = running the pipeline = The pipeline can be launched by submitting a job to the job control system, e.g.: <pre> qsub -l nodes=4:ppn=16 -W group_list=lofar ~/survey_pipeline/runfiles/qsub/run_pipe_L400115.qsub </pre> where <tt>nodes=4:ppn=16</tt> will request for the job to run on 4 entire (16 cpu) nodes. An example <tt>qsub</tt> script is: <pre> #!/bin/bash #PBS -N pipeline-L424617 #PBS -l pmem=2gb #PBS -l walltime=24:00:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: ARRAYID = $PBS_ARRAYID echo ------------------------------------------------------ modulecmd bash load lofar ; PATH=/soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:${PATH} ; . /soft/lofar-070915/lofarinit.sh; PATH=/home/mjh/lofar/bin:$PATH; export PATH; PYTHONPATH=/soft/pyrap-new:$PYTHONPATH; export PYTHONPATH ; LD_LIBRARY_PATH=/soft/pyrap-new/lib64:/soft/boost/lib:/soft/casacore-1.7.0/lib:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> where <pre> genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> is the actual call to run the generic pipeline. The required arguments are a parset and configuration file. The parset is the important part which defines all the steps to be take. 
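For reference, the pieces of that call are summarised below; the flag meanings are our reading of the lofarpipe control script rather than something documented here, so treat them as a best guess:
<pre>
genericpipeline.py <parset> -v -d -c <pipeline.cfg>
#  <parset>          : the parset defining the steps to run (e.g. PreFacetCal_L424617.parset)
#  -v, -d            : verbose and debugging output respectively (our interpretation)
#  -c <pipeline.cfg> : the pipeline configuration file described next
</pre>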
The configuration file should be something like: <pre> [DEFAULT] lofarroot = /soft/lofar-070915 casaroot = /soft/casacore-1.7.0 pyraproot = /soft hdf5root = wcsroot = /opt/cep/wcslib pythonpath = /soft/lofar-070915/lib64/python2.7/site-packages runtime_directory = /car-data/wwilliams/pipeline-output/ recipe_directories = [%(pythonpath)s/lofarpipe/recipes,/home/wwilliams/scripts/git/prefactor] working_directory = /car-data/wwilliams/pipeline-products/ task_files = [%(lofarroot)s/share/pipeline/tasks.cfg] [layout] job_directory = %(runtime_directory)s/%(job_name)s [cluster] clusterdesc = %(lofarroot)s/share/cep2.clusterdesc [deploy] engine_ppath = %(pythonpath)s:%(pyraproot)s/lib:/opt/cep/pythonlibs/lib/python/site-packages engine_lpath = %(lofarroot)s/lib:%(casaroot)s/lib:%(pyraproot)s/lib:%(hdf5root)s/lib:%(wcsroot)s/lib [logging] log_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/pipeline.log xml_stat_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/statistics.xml [feedback] # Method of providing feedback to LOFAR. # Valid options: # messagebus Send feedback and status using LCS/MessageBus # none Do NOT send feedback and status method = none [remote] method = mpiexec max_per_node = 16 </pre> * Note the the <tt>[feedback] method = none</tt> is set because the messagebus does not currently work on the Herts cluster. * The <tt>[remote]</tt> section with <tt>method = mpiexec</tt> allows for the use of multiple nodes. * The <tt>recipe_directories</tt> should be modified to include the Pre-Facet-Cal recipes (<tt>/home/wwilliams/scripts/git/prefactor</tt>). == Pre-Facet calibration == Set up the pre-facet calibration pipeline * download scripts from https://github.com/lofar-astron/prefactor. In addition to the changes mentioned in the genericpipeline -> quick-start page, you need to add the Pre-Facet-Cal directory to the "recipe_directories" in the pipeline.cfg (so that the plugins that come with the Pre-Facet-Cal are found). The current cookbook-section on the pre-facet pipeline can be found at https://github.com/lofar-astron/prefactor/blob/AH-development/docs/cookbook_prefacet.pdf * Edit the Pre-Facet-Cal.parset appropriately (see notes in the parset) and run the generic pipeline with this pre-facet-calibration parset. === Some known problems === * in argument.flags there should be no spaces! e.g. <pre>h5imp_cal.argument.flags = [h5_imp_map.output.mapfile, h5imp_cal_losoto.h5]</pre> 90b16b4680900b71b5d0e368cc4afbf91254e954 423 421 2016-02-05T17:26:03Z Wwilliams 13 Wwilliams moved page [[Running the generic pipeline]] to [[Generic pipeline]] wikitext text/x-wiki == The generic pipeline == The generic pipeline comes with LOFAR builds 2.13+. At http://www.astron.nl/citt/genericpipeline/#quick-start there is a description of how to setup and run the generic pipeline. At Herts, the pipeline is included in the most recent lofar build, where a few patches have been applied (<tt>$LOFARROOT=/soft/lofar-070915/</tt>). * A patch has been made to allow it to run across multiple whole nodes, in that you can submit a job which will run on N nodes. The [[jobs|job control system]] will allocate the nodes to you and the generic pipeline will deal with distributing the processes to the other nodes. This patch is in <tt>$LOFARROOT/lib64/python2.7/site-packages/lofarpipe/support/remotecommand.py</tt> * A patch has been made to fix a bug with output locations (in particular for calibrator diagnostic plots and npy files). 
This patch is in <tt>$LOFARROOT/lib64/python2.7/site-packages/lofarpipe/recipes/nodes/python_plugin.py</tt>, but should be fixed in the latest LOFAR builds (2.15+). The pipeline runs sends processes to other available nodes using <tt>mpiexec</tt>, but requires the <tt>openmpi</tt> version instead of what is loaded as standard on the cluster (see also [[MPI|MPI]]). For all the nodes to understand the <tt>mpiexec</tt> command that the pipeline sends via <tt>ssh</tt>, this should be set in your <tt>.cshrc</tt> file: <pre> module unload mpi/mpich-x86_64 module load mpi/openmpi-x86_64 </pre> The LOFAR software should also be sourced here, e.g.: <pre> alias lofar-newest 'setenv PATH /soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:${PATH} ; module load lofar; source /soft/lofar-070915/lofarinit.csh; setenv PATH /home/mjh/lofar/bin:$PATH; setenv PYTHONPATH /soft/pyrap-new:$PYTHONPATH; setenv LD_LIBRARY_PATH /soft/boost/lib:/soft/casacore-1.7.0/lib:$LD_LIBRARY_PATH' alias lofar-tools 'setenv PYTHONPATH /home/wwilliams/software/lib/python2.7/site-packages/:/home/wwilliams/software/anaconda2/lib/python2.7/site-packages/:/home/wwilliams/python/face-cal-scripts/:$PYTHONPATH' lofar-newest lofar-tools </pre> = running the pipeline = The pipeline can be launched by submitting a job to the job control system, e.g.: <pre> qsub -l nodes=4:ppn=16 -W group_list=lofar ~/survey_pipeline/runfiles/qsub/run_pipe_L400115.qsub </pre> where <tt>nodes=4:ppn=16</tt> will request for the job to run on 4 entire (16 cpu) nodes. An example <tt>qsub</tt> script is: <pre> #!/bin/bash #PBS -N pipeline-L424617 #PBS -l pmem=2gb #PBS -l walltime=24:00:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: ARRAYID = $PBS_ARRAYID echo ------------------------------------------------------ modulecmd bash load lofar ; PATH=/soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:${PATH} ; . /soft/lofar-070915/lofarinit.sh; PATH=/home/mjh/lofar/bin:$PATH; export PATH; PYTHONPATH=/soft/pyrap-new:$PYTHONPATH; export PYTHONPATH ; LD_LIBRARY_PATH=/soft/pyrap-new/lib64:/soft/boost/lib:/soft/casacore-1.7.0/lib:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> where <pre> genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> is the actual call to run the generic pipeline. The required arguments are a parset and configuration file. The parset is the important part which defines all the steps to be take. 
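Once the job has been submitted you can monitor it with the usual Torque commands; because the example script above uses <tt>#PBS -k oe</tt>, the job's output should also appear in your home directory while it runs. A sketch, with a made-up job ID:
<pre>
qstat -u $USER                       # watch the pipeline job move from Q to R to C
tail -f ~/pipeline-L424617.o12345    # follow its output (12345 is a placeholder job ID)
</pre>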
The configuration file should be something like: <pre> [DEFAULT] lofarroot = /soft/lofar-070915 casaroot = /soft/casacore-1.7.0 pyraproot = /soft hdf5root = wcsroot = /opt/cep/wcslib pythonpath = /soft/lofar-070915/lib64/python2.7/site-packages runtime_directory = /car-data/wwilliams/pipeline-output/ recipe_directories = [%(pythonpath)s/lofarpipe/recipes,/home/wwilliams/scripts/git/prefactor] working_directory = /car-data/wwilliams/pipeline-products/ task_files = [%(lofarroot)s/share/pipeline/tasks.cfg] [layout] job_directory = %(runtime_directory)s/%(job_name)s [cluster] clusterdesc = %(lofarroot)s/share/cep2.clusterdesc [deploy] engine_ppath = %(pythonpath)s:%(pyraproot)s/lib:/opt/cep/pythonlibs/lib/python/site-packages engine_lpath = %(lofarroot)s/lib:%(casaroot)s/lib:%(pyraproot)s/lib:%(hdf5root)s/lib:%(wcsroot)s/lib [logging] log_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/pipeline.log xml_stat_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/statistics.xml [feedback] # Method of providing feedback to LOFAR. # Valid options: # messagebus Send feedback and status using LCS/MessageBus # none Do NOT send feedback and status method = none [remote] method = mpiexec max_per_node = 16 </pre> * Note the the <tt>[feedback] method = none</tt> is set because the messagebus does not currently work on the Herts cluster. * The <tt>[remote]</tt> section with <tt>method = mpiexec</tt> allows for the use of multiple nodes. * The <tt>recipe_directories</tt> should be modified to include the Pre-Facet-Cal recipes (<tt>/home/wwilliams/scripts/git/prefactor</tt>). == Pre-Facet calibration == Set up the pre-facet calibration pipeline * download scripts from https://github.com/lofar-astron/prefactor. In addition to the changes mentioned in the genericpipeline -> quick-start page, you need to add the Pre-Facet-Cal directory to the "recipe_directories" in the pipeline.cfg (so that the plugins that come with the Pre-Facet-Cal are found). The current cookbook-section on the pre-facet pipeline can be found at https://github.com/lofar-astron/prefactor/blob/AH-development/docs/cookbook_prefacet.pdf * Edit the Pre-Facet-Cal.parset appropriately (see notes in the parset) and run the generic pipeline with this pre-facet-calibration parset. === Some known problems === * in argument.flags there should be no spaces! e.g. <pre>h5imp_cal.argument.flags = [h5_imp_map.output.mapfile, h5imp_cal_losoto.h5]</pre> 90b16b4680900b71b5d0e368cc4afbf91254e954 Running the generic pipeline 0 65 424 2016-02-05T17:26:03Z Wwilliams 13 Wwilliams moved page [[Running the generic pipeline]] to [[Generic pipeline]] wikitext text/x-wiki #REDIRECT [[Generic pipeline]] 6054ba2fedbdb74c6f7c3b24e2fe395f928ce437 Generic pipeline 0 64 425 423 2016-02-05T17:27:24Z Wwilliams 13 wikitext text/x-wiki = The generic pipeline = The generic pipeline comes with LOFAR builds 2.13+. At http://www.astron.nl/citt/genericpipeline/#quick-start there is a description of how to setup and run the generic pipeline. At Herts, the pipeline is included in the most recent lofar build, where a few patches have been applied (<tt>$LOFARROOT=/soft/lofar-070915/</tt>). * A patch has been made to allow it to run across multiple whole nodes, in that you can submit a job which will run on N nodes. The [[jobs|job control system]] will allocate the nodes to you and the generic pipeline will deal with distributing the processes to the other nodes. 
This patch is in <tt>$LOFARROOT/lib64/python2.7/site-packages/lofarpipe/support/remotecommand.py</tt> * A patch has been made to fix a bug with output locations (in particular for calibrator diagnostic plots and npy files). This patch is in <tt>$LOFARROOT/lib64/python2.7/site-packages/lofarpipe/recipes/nodes/python_plugin.py</tt>, but should be fixed in the latest LOFAR builds (2.15+). The pipeline runs sends processes to other available nodes using <tt>mpiexec</tt>, but requires the <tt>openmpi</tt> version instead of what is loaded as standard on the cluster (see also [[MPI|MPI]]). For all the nodes to understand the <tt>mpiexec</tt> command that the pipeline sends via <tt>ssh</tt>, this should be set in your <tt>.cshrc</tt> file: <pre> module unload mpi/mpich-x86_64 module load mpi/openmpi-x86_64 </pre> The LOFAR software should also be sourced here, e.g.: <pre> alias lofar-newest 'setenv PATH /soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:${PATH} ; module load lofar; source /soft/lofar-070915/lofarinit.csh; setenv PATH /home/mjh/lofar/bin:$PATH; setenv PYTHONPATH /soft/pyrap-new:$PYTHONPATH; setenv LD_LIBRARY_PATH /soft/boost/lib:/soft/casacore-1.7.0/lib:$LD_LIBRARY_PATH' alias lofar-tools 'setenv PYTHONPATH /home/wwilliams/software/lib/python2.7/site-packages/:/home/wwilliams/software/anaconda2/lib/python2.7/site-packages/:/home/wwilliams/python/face-cal-scripts/:$PYTHONPATH' lofar-newest lofar-tools </pre> = running the pipeline = The pipeline can be launched by submitting a job to the job control system, e.g.: <pre> qsub -l nodes=4:ppn=16 -W group_list=lofar ~/survey_pipeline/runfiles/qsub/run_pipe_L400115.qsub </pre> where <tt>nodes=4:ppn=16</tt> will request for the job to run on 4 entire (16 cpu) nodes. An example <tt>qsub</tt> script is: <pre> #!/bin/bash #PBS -N pipeline-L424617 #PBS -l pmem=2gb #PBS -l walltime=24:00:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: ARRAYID = $PBS_ARRAYID echo ------------------------------------------------------ modulecmd bash load lofar ; PATH=/soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:${PATH} ; . /soft/lofar-070915/lofarinit.sh; PATH=/home/mjh/lofar/bin:$PATH; export PATH; PYTHONPATH=/soft/pyrap-new:$PYTHONPATH; export PYTHONPATH ; LD_LIBRARY_PATH=/soft/pyrap-new/lib64:/soft/boost/lib:/soft/casacore-1.7.0/lib:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> where <pre> genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> is the actual call to run the generic pipeline. The required arguments are a parset and configuration file. The parset is the important part which defines all the steps to be take. 
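One practical note, based on our experience rather than anything documented here: the pipeline keeps its state under the <tt>runtime_directory</tt> set in the configuration file below, so a run that has died can usually just be resubmitted and it will skip the steps it has already completed. For a genuinely clean start you would remove that job's directory under <tt>runtime_directory</tt> first, e.g. (destructive; the directory name follows the job/parset name and both paths are illustrative):
<pre>
rm -rf /car-data/wwilliams/pipeline-output/PreFacetCal_L424617
qsub -l nodes=4:ppn=16 -W group_list=lofar ~/survey_pipeline/runfiles/qsub/run_pipe_L400115.qsub
</pre>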
The configuration file should be something like: <pre> [DEFAULT] lofarroot = /soft/lofar-070915 casaroot = /soft/casacore-1.7.0 pyraproot = /soft hdf5root = wcsroot = /opt/cep/wcslib pythonpath = /soft/lofar-070915/lib64/python2.7/site-packages runtime_directory = /car-data/wwilliams/pipeline-output/ recipe_directories = [%(pythonpath)s/lofarpipe/recipes,/home/wwilliams/scripts/git/prefactor] working_directory = /car-data/wwilliams/pipeline-products/ task_files = [%(lofarroot)s/share/pipeline/tasks.cfg] [layout] job_directory = %(runtime_directory)s/%(job_name)s [cluster] clusterdesc = %(lofarroot)s/share/cep2.clusterdesc [deploy] engine_ppath = %(pythonpath)s:%(pyraproot)s/lib:/opt/cep/pythonlibs/lib/python/site-packages engine_lpath = %(lofarroot)s/lib:%(casaroot)s/lib:%(pyraproot)s/lib:%(hdf5root)s/lib:%(wcsroot)s/lib [logging] log_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/pipeline.log xml_stat_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/statistics.xml [feedback] # Method of providing feedback to LOFAR. # Valid options: # messagebus Send feedback and status using LCS/MessageBus # none Do NOT send feedback and status method = none [remote] method = mpiexec max_per_node = 16 </pre> * Note the the <tt>[feedback] method = none</tt> is set because the messagebus does not currently work on the Herts cluster. * The <tt>[remote]</tt> section with <tt>method = mpiexec</tt> allows for the use of multiple nodes. * The <tt>recipe_directories</tt> should be modified to include the Pre-Facet-Cal recipes (<tt>/home/wwilliams/scripts/git/prefactor</tt>). = Pre-Facet calibration = Set up the pre-facet calibration pipeline * download scripts from https://github.com/lofar-astron/prefactor. In addition to the changes mentioned in the genericpipeline -> quick-start page, you need to add the Pre-Facet-Cal directory to the "recipe_directories" in the pipeline.cfg (so that the plugins that come with the Pre-Facet-Cal are found). The current cookbook-section on the pre-facet pipeline can be found at https://github.com/lofar-astron/prefactor/blob/AH-development/docs/cookbook_prefacet.pdf * Edit the Pre-Facet-Cal.parset appropriately (see notes in the parset) and run the generic pipeline with this pre-facet-calibration parset. == Some known problems == * in argument.flags there should be no spaces! e.g. <pre>h5imp_cal.argument.flags = [h5_imp_map.output.mapfile, h5imp_cal_losoto.h5]</pre> f458766d07ac14a37530bf925b0be61fd79858e8 426 425 2016-02-09T11:48:09Z Wwilliams 13 /* running the pipeline */ wikitext text/x-wiki = The generic pipeline = The generic pipeline comes with LOFAR builds 2.13+. At http://www.astron.nl/citt/genericpipeline/#quick-start there is a description of how to setup and run the generic pipeline. At Herts, the pipeline is included in the most recent lofar build, where a few patches have been applied (<tt>$LOFARROOT=/soft/lofar-070915/</tt>). * A patch has been made to allow it to run across multiple whole nodes, in that you can submit a job which will run on N nodes. The [[jobs|job control system]] will allocate the nodes to you and the generic pipeline will deal with distributing the processes to the other nodes. This patch is in <tt>$LOFARROOT/lib64/python2.7/site-packages/lofarpipe/support/remotecommand.py</tt> * A patch has been made to fix a bug with output locations (in particular for calibrator diagnostic plots and npy files). 
This patch is in <tt>$LOFARROOT/lib64/python2.7/site-packages/lofarpipe/recipes/nodes/python_plugin.py</tt>, but should be fixed in the latest LOFAR builds (2.15+). The pipeline runs sends processes to other available nodes using <tt>mpiexec</tt>, but requires the <tt>openmpi</tt> version instead of what is loaded as standard on the cluster (see also [[MPI|MPI]]). For all the nodes to understand the <tt>mpiexec</tt> command that the pipeline sends via <tt>ssh</tt>, this should be set in your <tt>.cshrc</tt> file: <pre> module unload mpi/mpich-x86_64 module load mpi/openmpi-x86_64 </pre> The LOFAR software should also be sourced here, e.g.: <pre> alias lofar-newest 'setenv PATH /soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:${PATH} ; module load lofar; source /soft/lofar-070915/lofarinit.csh; setenv PATH /home/mjh/lofar/bin:$PATH; setenv PYTHONPATH /soft/pyrap-new:$PYTHONPATH; setenv LD_LIBRARY_PATH /soft/boost/lib:/soft/casacore-1.7.0/lib:$LD_LIBRARY_PATH' alias lofar-tools 'setenv PYTHONPATH /home/wwilliams/software/lib/python2.7/site-packages/:/home/wwilliams/software/anaconda2/lib/python2.7/site-packages/:/home/wwilliams/python/face-cal-scripts/:$PYTHONPATH' lofar-newest lofar-tools </pre> = running the pipeline = The pipeline can be launched by submitting a job to the job control system, e.g.: <pre> qsub -l nodes=4:ppn=16 -W group_list=lofar ~/survey_pipeline/runfiles/qsub/run_pipe_L400115.qsub </pre> where <tt>nodes=4:ppn=16</tt> will request for the job to run on 4 entire (16 cpu) nodes. An example <tt>qsub</tt> script is: <pre> #!/bin/bash #PBS -N pipeline-L424617 #PBS -l pmem=2gb #PBS -l walltime=24:00:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: ARRAYID = $PBS_ARRAYID echo ------------------------------------------------------ modulecmd bash load lofar ; PATH=/soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:${PATH} ; . /soft/lofar-070915/lofarinit.sh; PATH=/home/mjh/lofar/bin:$PATH; export PATH; PYTHONPATH=/soft/pyrap-new:$PYTHONPATH; export PYTHONPATH ; LD_LIBRARY_PATH=/soft/pyrap-new/lib64:/soft/boost/lib:/soft/casacore-1.7.0/lib:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> where <pre> genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> is the actual call to run the generic pipeline. The required arguments are a parset and configuration file. The parset is the important part which defines all the steps to be take. 
The configuration file should be something like: <pre> [DEFAULT] lofarroot = /soft/lofar-070915 casaroot = /soft/casacore-1.7.0 pyraproot = /soft hdf5root = wcsroot = /opt/cep/wcslib pythonpath = /soft/lofar-070915/lib64/python2.7/site-packages runtime_directory = /car-data/wwilliams/pipeline-output/ recipe_directories = [%(pythonpath)s/lofarpipe/recipes,/home/wwilliams/scripts/git/prefactor] working_directory = /car-data/wwilliams/pipeline-products/ task_files = [%(lofarroot)s/share/pipeline/tasks.cfg] [layout] job_directory = %(runtime_directory)s/%(job_name)s [cluster] clusterdesc = %(lofarroot)s/share/cep2.clusterdesc [deploy] engine_ppath = %(pythonpath)s:%(pyraproot)s/lib:/opt/cep/pythonlibs/lib/python/site-packages engine_lpath = %(lofarroot)s/lib:%(casaroot)s/lib:%(pyraproot)s/lib:%(hdf5root)s/lib:%(wcsroot)s/lib [logging] log_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/pipeline.log xml_stat_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/statistics.xml [feedback] # Method of providing feedback to LOFAR. # Valid options: # messagebus Send feedback and status using LCS/MessageBus # none Do NOT send feedback and status method = none [remote] method = mpiexec max_per_node = 16 </pre> * Note <tt>[feedback] method = none</tt> is set because the messagebus does not currently work on the Herts cluster. * <tt>[remote] method = mpiexec</tt> allows for the use of multiple nodes. * The <tt>recipe_directories</tt> should be modified to include the Pre-Facet-Cal recipes (<tt>/home/wwilliams/scripts/git/prefactor</tt>). = Pre-Facet calibration = Set up the pre-facet calibration pipeline * download scripts from https://github.com/lofar-astron/prefactor. In addition to the changes mentioned in the genericpipeline -> quick-start page, you need to add the Pre-Facet-Cal directory to the "recipe_directories" in the pipeline.cfg (so that the plugins that come with the Pre-Facet-Cal are found). The current cookbook-section on the pre-facet pipeline can be found at https://github.com/lofar-astron/prefactor/blob/AH-development/docs/cookbook_prefacet.pdf * Edit the Pre-Facet-Cal.parset appropriately (see notes in the parset) and run the generic pipeline with this pre-facet-calibration parset. == Some known problems == * in argument.flags there should be no spaces! e.g. <pre>h5imp_cal.argument.flags = [h5_imp_map.output.mapfile, h5imp_cal_losoto.h5]</pre> dd58e1f25d1cb2753f03a045b1df59d6575a3e13 427 426 2016-02-16T15:40:45Z Wwilliams 13 /* The generic pipeline */ wikitext text/x-wiki = The generic pipeline = The generic pipeline comes with LOFAR builds 2.13+. At http://www.astron.nl/citt/genericpipeline/#quick-start there is a description of how to setup and run the generic pipeline. At Herts, the pipeline is included in the most recent lofar build (2.15 : <tt>$LOFARROOT=/soft/lofar-050216/</tt>). * A patch has been made to allow the pipeline to run across multiple whole nodes, in that you can submit a job which will run on N nodes. The [[jobs|job control system]] will allocate the nodes to you and the generic pipeline will deal with distributing the processes to the other nodes. This patch is in <tt>$LOFARROOT/lib64/python2.7/site-packages/lofarpipe/support/remotecommand.py</tt> The pipeline runs sends processes to other available nodes using <tt>mpiexec</tt>, but requires the <tt>openmpi</tt> version instead of what is loaded as standard on the cluster (see also [[MPI|MPI]]). 
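You can check which MPI stack a fresh, non-interactive shell on a node picks up, both before and after making the <tt>.cshrc</tt> change described below; this check is illustrative only:
<pre>
ssh node001 'module list; which mpiexec'   # should report the openmpi module once .cshrc is set up
</pre>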
For all the nodes to understand the <tt>mpiexec</tt> command that the pipeline sends via <tt>ssh</tt>, this should be set in your <tt>.cshrc</tt> file: <pre> module unload mpi/mpich-x86_64 module load mpi/openmpi-x86_64 </pre> The LOFAR software should also be sourced here, e.g.: <pre> alias lofar-newest 'setenv PATH /soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:${PATH} ; module load lofar; source /soft/lofar-070915/lofarinit.csh; setenv PATH /home/mjh/lofar/bin:$PATH; setenv PYTHONPATH /soft/pyrap-new:$PYTHONPATH; setenv LD_LIBRARY_PATH /soft/boost/lib:/soft/casacore-1.7.0/lib:$LD_LIBRARY_PATH' alias lofar-tools 'setenv PYTHONPATH /home/wwilliams/software/lib/python2.7/site-packages/:/home/wwilliams/software/anaconda2/lib/python2.7/site-packages/:/home/wwilliams/python/face-cal-scripts/:$PYTHONPATH' lofar-newest lofar-tools </pre> = running the pipeline = The pipeline can be launched by submitting a job to the job control system, e.g.: <pre> qsub -l nodes=4:ppn=16 -W group_list=lofar ~/survey_pipeline/runfiles/qsub/run_pipe_L400115.qsub </pre> where <tt>nodes=4:ppn=16</tt> will request for the job to run on 4 entire (16 cpu) nodes. An example <tt>qsub</tt> script is: <pre> #!/bin/bash #PBS -N pipeline-L424617 #PBS -l pmem=2gb #PBS -l walltime=24:00:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: ARRAYID = $PBS_ARRAYID echo ------------------------------------------------------ modulecmd bash load lofar ; PATH=/soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:${PATH} ; . /soft/lofar-070915/lofarinit.sh; PATH=/home/mjh/lofar/bin:$PATH; export PATH; PYTHONPATH=/soft/pyrap-new:$PYTHONPATH; export PYTHONPATH ; LD_LIBRARY_PATH=/soft/pyrap-new/lib64:/soft/boost/lib:/soft/casacore-1.7.0/lib:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> where <pre> genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> is the actual call to run the generic pipeline. The required arguments are a parset and configuration file. The parset is the important part which defines all the steps to be take. 
The configuration file should be something like: <pre> [DEFAULT] lofarroot = /soft/lofar-070915 casaroot = /soft/casacore-1.7.0 pyraproot = /soft hdf5root = wcsroot = /opt/cep/wcslib pythonpath = /soft/lofar-070915/lib64/python2.7/site-packages runtime_directory = /car-data/wwilliams/pipeline-output/ recipe_directories = [%(pythonpath)s/lofarpipe/recipes,/home/wwilliams/scripts/git/prefactor] working_directory = /car-data/wwilliams/pipeline-products/ task_files = [%(lofarroot)s/share/pipeline/tasks.cfg] [layout] job_directory = %(runtime_directory)s/%(job_name)s [cluster] clusterdesc = %(lofarroot)s/share/cep2.clusterdesc [deploy] engine_ppath = %(pythonpath)s:%(pyraproot)s/lib:/opt/cep/pythonlibs/lib/python/site-packages engine_lpath = %(lofarroot)s/lib:%(casaroot)s/lib:%(pyraproot)s/lib:%(hdf5root)s/lib:%(wcsroot)s/lib [logging] log_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/pipeline.log xml_stat_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/statistics.xml [feedback] # Method of providing feedback to LOFAR. # Valid options: # messagebus Send feedback and status using LCS/MessageBus # none Do NOT send feedback and status method = none [remote] method = mpiexec max_per_node = 16 </pre> * Note <tt>[feedback] method = none</tt> is set because the messagebus does not currently work on the Herts cluster. * <tt>[remote] method = mpiexec</tt> allows for the use of multiple nodes. * The <tt>recipe_directories</tt> should be modified to include the Pre-Facet-Cal recipes (<tt>/home/wwilliams/scripts/git/prefactor</tt>). = Pre-Facet calibration = Set up the pre-facet calibration pipeline * download scripts from https://github.com/lofar-astron/prefactor. In addition to the changes mentioned in the genericpipeline -> quick-start page, you need to add the Pre-Facet-Cal directory to the "recipe_directories" in the pipeline.cfg (so that the plugins that come with the Pre-Facet-Cal are found). The current cookbook-section on the pre-facet pipeline can be found at https://github.com/lofar-astron/prefactor/blob/AH-development/docs/cookbook_prefacet.pdf * Edit the Pre-Facet-Cal.parset appropriately (see notes in the parset) and run the generic pipeline with this pre-facet-calibration parset. == Some known problems == * in argument.flags there should be no spaces! e.g. <pre>h5imp_cal.argument.flags = [h5_imp_map.output.mapfile, h5imp_cal_losoto.h5]</pre> 5182c135deccf071286d65cc8234a98d0505c7f3 428 427 2016-02-16T15:41:29Z Wwilliams 13 /* The generic pipeline */ wikitext text/x-wiki = The generic pipeline = The generic pipeline comes with LOFAR builds 2.13+. At http://www.astron.nl/citt/genericpipeline/#quick-start there is a description of how to setup and run the generic pipeline. At Herts, the pipeline is included in the most recent lofar build (2.15 : <tt>$LOFARROOT=/soft/lofar-050216/</tt>). * A patch has been made to allow the pipeline to run across multiple whole nodes, in that you can submit a job which will run on N nodes. The [[jobs|job control system]] will allocate the nodes to you and the generic pipeline will deal with distributing the processes to the other nodes. This patch is in <tt>$LOFARROOT/lib64/python2.7/site-packages/lofarpipe/support/remotecommand.py</tt> The pipeline runs sends processes to other available nodes using <tt>mpiexec</tt>, but requires the <tt>openmpi</tt> version instead of what is loaded as standard on the cluster (see also [[MPI|MPI]]). 
For all the nodes to understand the <tt>mpiexec</tt> command that the pipeline sends via <tt>ssh</tt>, this should be set in your <tt>.cshrc</tt> file: <pre> module unload mpi/mpich-x86_64 module load mpi/openmpi-x86_64 </pre> The LOFAR software should also be sourced here, e.g.: <pre> alias lofar-newest 'setenv PATH /soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:${PATH} ; module load lofar; source /soft/lofar-050216/lofarinit.csh; setenv PATH /home/mjh/lofar/bin:$PATH; setenv PYTHONPATH /soft/pyrap-new:$PYTHONPATH; setenv LD_LIBRARY_PATH /soft/boost/lib:/soft/casacore-1.7.0/lib:$LD_LIBRARY_PATH' alias lofar-tools 'setenv PYTHONPATH /home/wwilliams/software/lib/python2.7/site-packages/:/home/wwilliams/software/anaconda2/lib/python2.7/site-packages/:/home/wwilliams/python/face-cal-scripts/:$PYTHONPATH' lofar-newest lofar-tools </pre> = running the pipeline = The pipeline can be launched by submitting a job to the job control system, e.g.: <pre> qsub -l nodes=4:ppn=16 -W group_list=lofar ~/survey_pipeline/runfiles/qsub/run_pipe_L400115.qsub </pre> where <tt>nodes=4:ppn=16</tt> will request for the job to run on 4 entire (16 cpu) nodes. An example <tt>qsub</tt> script is: <pre> #!/bin/bash #PBS -N pipeline-L424617 #PBS -l pmem=2gb #PBS -l walltime=24:00:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: ARRAYID = $PBS_ARRAYID echo ------------------------------------------------------ modulecmd bash load lofar ; PATH=/soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:${PATH} ; . /soft/lofar-070915/lofarinit.sh; PATH=/home/mjh/lofar/bin:$PATH; export PATH; PYTHONPATH=/soft/pyrap-new:$PYTHONPATH; export PYTHONPATH ; LD_LIBRARY_PATH=/soft/pyrap-new/lib64:/soft/boost/lib:/soft/casacore-1.7.0/lib:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> where <pre> genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> is the actual call to run the generic pipeline. The required arguments are a parset and configuration file. The parset is the important part which defines all the steps to be take. 
The configuration file should be something like: <pre> [DEFAULT] lofarroot = /soft/lofar-070915 casaroot = /soft/casacore-1.7.0 pyraproot = /soft hdf5root = wcsroot = /opt/cep/wcslib pythonpath = /soft/lofar-070915/lib64/python2.7/site-packages runtime_directory = /car-data/wwilliams/pipeline-output/ recipe_directories = [%(pythonpath)s/lofarpipe/recipes,/home/wwilliams/scripts/git/prefactor] working_directory = /car-data/wwilliams/pipeline-products/ task_files = [%(lofarroot)s/share/pipeline/tasks.cfg] [layout] job_directory = %(runtime_directory)s/%(job_name)s [cluster] clusterdesc = %(lofarroot)s/share/cep2.clusterdesc [deploy] engine_ppath = %(pythonpath)s:%(pyraproot)s/lib:/opt/cep/pythonlibs/lib/python/site-packages engine_lpath = %(lofarroot)s/lib:%(casaroot)s/lib:%(pyraproot)s/lib:%(hdf5root)s/lib:%(wcsroot)s/lib [logging] log_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/pipeline.log xml_stat_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/statistics.xml [feedback] # Method of providing feedback to LOFAR. # Valid options: # messagebus Send feedback and status using LCS/MessageBus # none Do NOT send feedback and status method = none [remote] method = mpiexec max_per_node = 16 </pre> * Note <tt>[feedback] method = none</tt> is set because the messagebus does not currently work on the Herts cluster. * <tt>[remote] method = mpiexec</tt> allows for the use of multiple nodes. * The <tt>recipe_directories</tt> should be modified to include the Pre-Facet-Cal recipes (<tt>/home/wwilliams/scripts/git/prefactor</tt>). = Pre-Facet calibration = Set up the pre-facet calibration pipeline * download scripts from https://github.com/lofar-astron/prefactor. In addition to the changes mentioned in the genericpipeline -> quick-start page, you need to add the Pre-Facet-Cal directory to the "recipe_directories" in the pipeline.cfg (so that the plugins that come with the Pre-Facet-Cal are found). The current cookbook-section on the pre-facet pipeline can be found at https://github.com/lofar-astron/prefactor/blob/AH-development/docs/cookbook_prefacet.pdf * Edit the Pre-Facet-Cal.parset appropriately (see notes in the parset) and run the generic pipeline with this pre-facet-calibration parset. == Some known problems == * in argument.flags there should be no spaces! e.g. <pre>h5imp_cal.argument.flags = [h5_imp_map.output.mapfile, h5imp_cal_losoto.h5]</pre> f6ad7bcd4bd662868b96cab48bbdcc34f2532a8c 429 428 2016-02-16T15:42:11Z Wwilliams 13 /* running the pipeline */ wikitext text/x-wiki = The generic pipeline = The generic pipeline comes with LOFAR builds 2.13+. At http://www.astron.nl/citt/genericpipeline/#quick-start there is a description of how to setup and run the generic pipeline. At Herts, the pipeline is included in the most recent lofar build (2.15 : <tt>$LOFARROOT=/soft/lofar-050216/</tt>). * A patch has been made to allow the pipeline to run across multiple whole nodes, in that you can submit a job which will run on N nodes. The [[jobs|job control system]] will allocate the nodes to you and the generic pipeline will deal with distributing the processes to the other nodes. This patch is in <tt>$LOFARROOT/lib64/python2.7/site-packages/lofarpipe/support/remotecommand.py</tt> The pipeline runs sends processes to other available nodes using <tt>mpiexec</tt>, but requires the <tt>openmpi</tt> version instead of what is loaded as standard on the cluster (see also [[MPI|MPI]]). 
For all the nodes to understand the <tt>mpiexec</tt> command that the pipeline sends via <tt>ssh</tt>, this should be set in your <tt>.cshrc</tt> file: <pre> module unload mpi/mpich-x86_64 module load mpi/openmpi-x86_64 </pre> The LOFAR software should also be sourced here, e.g.: <pre> alias lofar-newest 'setenv PATH /soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:${PATH} ; module load lofar; source /soft/lofar-050216/lofarinit.csh; setenv PATH /home/mjh/lofar/bin:$PATH; setenv PYTHONPATH /soft/pyrap-new:$PYTHONPATH; setenv LD_LIBRARY_PATH /soft/boost/lib:/soft/casacore-1.7.0/lib:$LD_LIBRARY_PATH' alias lofar-tools 'setenv PYTHONPATH /home/wwilliams/software/lib/python2.7/site-packages/:/home/wwilliams/software/anaconda2/lib/python2.7/site-packages/:/home/wwilliams/python/face-cal-scripts/:$PYTHONPATH' lofar-newest lofar-tools </pre> = running the pipeline = The pipeline can be launched by submitting a job to the job control system, e.g.: <pre> qsub -l nodes=4:ppn=16 -W group_list=lofar ~/survey_pipeline/runfiles/qsub/run_pipe_L400115.qsub </pre> where <tt>nodes=4:ppn=16</tt> will request for the job to run on 4 entire (16 cpu) nodes. An example <tt>qsub</tt> script is: <pre> #!/bin/bash #PBS -N pipeline-L424617 #PBS -l pmem=2gb #PBS -l walltime=24:00:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: ARRAYID = $PBS_ARRAYID echo ------------------------------------------------------ modulecmd bash load lofar ; PATH=/soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:${PATH} ; . /soft/lofar-070915/lofarinit.sh; PATH=/home/mjh/lofar/bin:$PATH; export PATH; PYTHONPATH=/soft/pyrap-new:$PYTHONPATH; export PYTHONPATH ; LD_LIBRARY_PATH=/soft/pyrap-new/lib64:/soft/boost/lib:/soft/casacore-1.7.0/lib:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> where <pre> genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> is the actual call to run the generic pipeline. The required arguments are a parset and configuration file. The parset is the important part which defines all the steps to be take. 
The configuration file should be something like: <pre> [DEFAULT] lofarroot = /soft/lofar-050216 casaroot = /soft/casacore-1.7.0 pyraproot = /soft hdf5root = wcsroot = /opt/cep/wcslib pythonpath = /soft/lofar-050216/lib64/python2.7/site-packages runtime_directory = /car-data/wwilliams/pipeline-output/ recipe_directories = [%(pythonpath)s/lofarpipe/recipes,/home/wwilliams/scripts/git/prefactor] working_directory = /car-data/wwilliams/pipeline-products/ task_files = [%(lofarroot)s/share/pipeline/tasks.cfg] [layout] job_directory = %(runtime_directory)s/%(job_name)s [cluster] clusterdesc = %(lofarroot)s/share/cep2.clusterdesc [deploy] engine_ppath = %(pythonpath)s:%(pyraproot)s/lib:/opt/cep/pythonlibs/lib/python/site-packages engine_lpath = %(lofarroot)s/lib:%(casaroot)s/lib:%(pyraproot)s/lib:%(hdf5root)s/lib:%(wcsroot)s/lib [logging] log_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/pipeline.log xml_stat_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/statistics.xml [feedback] # Method of providing feedback to LOFAR. # Valid options: # messagebus Send feedback and status using LCS/MessageBus # none Do NOT send feedback and status method = none [remote] method = mpiexec max_per_node = 16 </pre> * Note <tt>[feedback] method = none</tt> is set because the messagebus does not currently work on the Herts cluster. * <tt>[remote] method = mpiexec</tt> allows for the use of multiple nodes. * The <tt>recipe_directories</tt> should be modified to include the Pre-Facet-Cal recipes (<tt>/home/wwilliams/scripts/git/prefactor</tt>). = Pre-Facet calibration = Set up the pre-facet calibration pipeline * download scripts from https://github.com/lofar-astron/prefactor. In addition to the changes mentioned in the genericpipeline -> quick-start page, you need to add the Pre-Facet-Cal directory to the "recipe_directories" in the pipeline.cfg (so that the plugins that come with the Pre-Facet-Cal are found). The current cookbook-section on the pre-facet pipeline can be found at https://github.com/lofar-astron/prefactor/blob/AH-development/docs/cookbook_prefacet.pdf * Edit the Pre-Facet-Cal.parset appropriately (see notes in the parset) and run the generic pipeline with this pre-facet-calibration parset. == Some known problems == * in argument.flags there should be no spaces! e.g. <pre>h5imp_cal.argument.flags = [h5_imp_map.output.mapfile, h5imp_cal_losoto.h5]</pre> 332d43031ab57aa4b0ebc0bd40262705fc893bfc 430 429 2016-02-16T15:42:39Z Wwilliams 13 /* running the pipeline */ wikitext text/x-wiki = The generic pipeline = The generic pipeline comes with LOFAR builds 2.13+. At http://www.astron.nl/citt/genericpipeline/#quick-start there is a description of how to setup and run the generic pipeline. At Herts, the pipeline is included in the most recent lofar build (2.15 : <tt>$LOFARROOT=/soft/lofar-050216/</tt>). * A patch has been made to allow the pipeline to run across multiple whole nodes, in that you can submit a job which will run on N nodes. The [[jobs|job control system]] will allocate the nodes to you and the generic pipeline will deal with distributing the processes to the other nodes. This patch is in <tt>$LOFARROOT/lib64/python2.7/site-packages/lofarpipe/support/remotecommand.py</tt> The pipeline runs sends processes to other available nodes using <tt>mpiexec</tt>, but requires the <tt>openmpi</tt> version instead of what is loaded as standard on the cluster (see also [[MPI|MPI]]). 
For all the nodes to understand the <tt>mpiexec</tt> command that the pipeline sends via <tt>ssh</tt>, this should be set in your <tt>.cshrc</tt> file: <pre> module unload mpi/mpich-x86_64 module load mpi/openmpi-x86_64 </pre> The LOFAR software should also be sourced here, e.g.: <pre> alias lofar-newest 'setenv PATH /soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:${PATH} ; module load lofar; source /soft/lofar-050216/lofarinit.csh; setenv PATH /home/mjh/lofar/bin:$PATH; setenv PYTHONPATH /soft/pyrap-new:$PYTHONPATH; setenv LD_LIBRARY_PATH /soft/boost/lib:/soft/casacore-1.7.0/lib:$LD_LIBRARY_PATH' alias lofar-tools 'setenv PYTHONPATH /home/wwilliams/software/lib/python2.7/site-packages/:/home/wwilliams/software/anaconda2/lib/python2.7/site-packages/:/home/wwilliams/python/face-cal-scripts/:$PYTHONPATH' lofar-newest lofar-tools </pre> = running the pipeline = The pipeline can be launched by submitting a job to the job control system, e.g.: <pre> qsub -l nodes=4:ppn=16 -W group_list=lofar ~/survey_pipeline/runfiles/qsub/run_pipe_L400115.qsub </pre> where <tt>nodes=4:ppn=16</tt> will request for the job to run on 4 entire (16 cpu) nodes. An example <tt>qsub</tt> script is: <pre> #!/bin/bash #PBS -N pipeline-L424617 #PBS -l pmem=2gb #PBS -l walltime=24:00:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: ARRAYID = $PBS_ARRAYID echo ------------------------------------------------------ modulecmd bash load lofar ; PATH=/soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:${PATH} ; . /soft/lofar-050216/lofarinit.sh; PATH=/home/mjh/lofar/bin:$PATH; export PATH; PYTHONPATH=/soft/pyrap-new:$PYTHONPATH; export PYTHONPATH ; LD_LIBRARY_PATH=/soft/pyrap-new/lib64:/soft/boost/lib:/soft/casacore-1.7.0/lib:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> where <pre> genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> is the actual call to run the generic pipeline. The required arguments are a parset and configuration file. The parset is the important part which defines all the steps to be take. 
The configuration file should be something like: <pre> [DEFAULT] lofarroot = /soft/lofar-050216 casaroot = /soft/casacore-1.7.0 pyraproot = /soft hdf5root = wcsroot = /opt/cep/wcslib pythonpath = /soft/lofar-050216/lib64/python2.7/site-packages runtime_directory = /car-data/wwilliams/pipeline-output/ recipe_directories = [%(pythonpath)s/lofarpipe/recipes,/home/wwilliams/scripts/git/prefactor] working_directory = /car-data/wwilliams/pipeline-products/ task_files = [%(lofarroot)s/share/pipeline/tasks.cfg] [layout] job_directory = %(runtime_directory)s/%(job_name)s [cluster] clusterdesc = %(lofarroot)s/share/cep2.clusterdesc [deploy] engine_ppath = %(pythonpath)s:%(pyraproot)s/lib:/opt/cep/pythonlibs/lib/python/site-packages engine_lpath = %(lofarroot)s/lib:%(casaroot)s/lib:%(pyraproot)s/lib:%(hdf5root)s/lib:%(wcsroot)s/lib [logging] log_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/pipeline.log xml_stat_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/statistics.xml [feedback] # Method of providing feedback to LOFAR. # Valid options: # messagebus Send feedback and status using LCS/MessageBus # none Do NOT send feedback and status method = none [remote] method = mpiexec max_per_node = 16 </pre> * Note <tt>[feedback] method = none</tt> is set because the messagebus does not currently work on the Herts cluster. * <tt>[remote] method = mpiexec</tt> allows for the use of multiple nodes. * The <tt>recipe_directories</tt> should be modified to include the Pre-Facet-Cal recipes (<tt>/home/wwilliams/scripts/git/prefactor</tt>). = Pre-Facet calibration = Set up the pre-facet calibration pipeline * download scripts from https://github.com/lofar-astron/prefactor. In addition to the changes mentioned in the genericpipeline -> quick-start page, you need to add the Pre-Facet-Cal directory to the "recipe_directories" in the pipeline.cfg (so that the plugins that come with the Pre-Facet-Cal are found). The current cookbook-section on the pre-facet pipeline can be found at https://github.com/lofar-astron/prefactor/blob/AH-development/docs/cookbook_prefacet.pdf * Edit the Pre-Facet-Cal.parset appropriately (see notes in the parset) and run the generic pipeline with this pre-facet-calibration parset. == Some known problems == * in argument.flags there should be no spaces! e.g. <pre>h5imp_cal.argument.flags = [h5_imp_map.output.mapfile, h5imp_cal_losoto.h5]</pre> 7d9eaccf713ad682858f1b0f8289f47a2f7a59d0 431 430 2016-02-16T15:45:35Z Wwilliams 13 add ssh fix wikitext text/x-wiki = The generic pipeline = The generic pipeline comes with LOFAR builds 2.13+. At http://www.astron.nl/citt/genericpipeline/#quick-start there is a description of how to setup and run the generic pipeline. At Herts, the pipeline is included in the most recent lofar build (2.15 : <tt>$LOFARROOT=/soft/lofar-050216/</tt>). * A patch has been made to allow the pipeline to run across multiple whole nodes, in that you can submit a job which will run on N nodes. The [[jobs|job control system]] will allocate the nodes to you and the generic pipeline will deal with distributing the processes to the other nodes. This patch is in <tt>$LOFARROOT/lib64/python2.7/site-packages/lofarpipe/support/remotecommand.py</tt> The pipeline runs sends processes to other available nodes using <tt>mpiexec</tt>, but requires the <tt>openmpi</tt> version instead of what is loaded as standard on the cluster (see also [[MPI|MPI]]). 
For all the nodes to understand the <tt>mpiexec</tt> command that the pipeline sends via <tt>ssh</tt>, this should be set in your <tt>.cshrc</tt> file: <pre> module unload mpi/mpich-x86_64 module load mpi/openmpi-x86_64 </pre> The LOFAR software should also be sourced here, e.g.: <pre> alias lofar-newest 'setenv PATH /soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:${PATH} ; module load lofar; source /soft/lofar-050216/lofarinit.csh; setenv PATH /home/mjh/lofar/bin:$PATH; setenv PYTHONPATH /soft/pyrap-new:$PYTHONPATH; setenv LD_LIBRARY_PATH /soft/boost/lib:/soft/casacore-1.7.0/lib:$LD_LIBRARY_PATH' alias lofar-tools 'setenv PYTHONPATH /home/wwilliams/software/lib/python2.7/site-packages/:/home/wwilliams/software/anaconda2/lib/python2.7/site-packages/:/home/wwilliams/python/face-cal-scripts/:$PYTHONPATH' lofar-newest lofar-tools </pre> To allow multiple versions of the pipeline on different nodes, the ssh host checking needs to be bypassed for localhost. This can be done in your <tt>.ssh/config</tt> file: <pre> Host localhost StrictHostKeyChecking no UserKnownHostsFile=/dev/null </pre> = running the pipeline = The pipeline can be launched by submitting a job to the job control system, e.g.: <pre> qsub -l nodes=4:ppn=16 -W group_list=lofar ~/survey_pipeline/runfiles/qsub/run_pipe_L400115.qsub </pre> where <tt>nodes=4:ppn=16</tt> will request for the job to run on 4 entire (16 cpu) nodes. An example <tt>qsub</tt> script is: <pre> #!/bin/bash #PBS -N pipeline-L424617 #PBS -l pmem=2gb #PBS -l walltime=24:00:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: ARRAYID = $PBS_ARRAYID echo ------------------------------------------------------ modulecmd bash load lofar ; PATH=/soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:${PATH} ; . /soft/lofar-050216/lofarinit.sh; PATH=/home/mjh/lofar/bin:$PATH; export PATH; PYTHONPATH=/soft/pyrap-new:$PYTHONPATH; export PYTHONPATH ; LD_LIBRARY_PATH=/soft/pyrap-new/lib64:/soft/boost/lib:/soft/casacore-1.7.0/lib:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> where <pre> genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> is the actual call to run the generic pipeline. The required arguments are a parset and configuration file. The parset is the important part which defines all the steps to be take. 
The configuration file should be something like: <pre> [DEFAULT] lofarroot = /soft/lofar-050216 casaroot = /soft/casacore-1.7.0 pyraproot = /soft hdf5root = wcsroot = /opt/cep/wcslib pythonpath = /soft/lofar-050216/lib64/python2.7/site-packages runtime_directory = /car-data/wwilliams/pipeline-output/ recipe_directories = [%(pythonpath)s/lofarpipe/recipes,/home/wwilliams/scripts/git/prefactor] working_directory = /car-data/wwilliams/pipeline-products/ task_files = [%(lofarroot)s/share/pipeline/tasks.cfg] [layout] job_directory = %(runtime_directory)s/%(job_name)s [cluster] clusterdesc = %(lofarroot)s/share/cep2.clusterdesc [deploy] engine_ppath = %(pythonpath)s:%(pyraproot)s/lib:/opt/cep/pythonlibs/lib/python/site-packages engine_lpath = %(lofarroot)s/lib:%(casaroot)s/lib:%(pyraproot)s/lib:%(hdf5root)s/lib:%(wcsroot)s/lib [logging] log_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/pipeline.log xml_stat_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/statistics.xml [feedback] # Method of providing feedback to LOFAR. # Valid options: # messagebus Send feedback and status using LCS/MessageBus # none Do NOT send feedback and status method = none [remote] method = mpiexec max_per_node = 16 </pre> * Note <tt>[feedback] method = none</tt> is set because the messagebus does not currently work on the Herts cluster. * <tt>[remote] method = mpiexec</tt> allows for the use of multiple nodes. * The <tt>recipe_directories</tt> should be modified to include the Pre-Facet-Cal recipes (<tt>/home/wwilliams/scripts/git/prefactor</tt>). = Pre-Facet calibration = Set up the pre-facet calibration pipeline * download scripts from https://github.com/lofar-astron/prefactor. In addition to the changes mentioned in the genericpipeline -> quick-start page, you need to add the Pre-Facet-Cal directory to the "recipe_directories" in the pipeline.cfg (so that the plugins that come with the Pre-Facet-Cal are found). The current cookbook-section on the pre-facet pipeline can be found at https://github.com/lofar-astron/prefactor/blob/AH-development/docs/cookbook_prefacet.pdf * Edit the Pre-Facet-Cal.parset appropriately (see notes in the parset) and run the generic pipeline with this pre-facet-calibration parset. == Some known problems == * in argument.flags there should be no spaces! e.g. <pre>h5imp_cal.argument.flags = [h5_imp_map.output.mapfile, h5imp_cal_losoto.h5]</pre> 2aa28dbf5f5494c8faa063da08c782b6c332fc20 432 431 2016-02-16T15:47:26Z Wwilliams 13 /* running the pipeline */ wikitext text/x-wiki = The generic pipeline = The generic pipeline comes with LOFAR builds 2.13+. At http://www.astron.nl/citt/genericpipeline/#quick-start there is a description of how to setup and run the generic pipeline. At Herts, the pipeline is included in the most recent lofar build (2.15 : <tt>$LOFARROOT=/soft/lofar-050216/</tt>). * A patch has been made to allow the pipeline to run across multiple whole nodes, in that you can submit a job which will run on N nodes. The [[jobs|job control system]] will allocate the nodes to you and the generic pipeline will deal with distributing the processes to the other nodes. This patch is in <tt>$LOFARROOT/lib64/python2.7/site-packages/lofarpipe/support/remotecommand.py</tt> The pipeline runs sends processes to other available nodes using <tt>mpiexec</tt>, but requires the <tt>openmpi</tt> version instead of what is loaded as standard on the cluster (see also [[MPI|MPI]]). 
For all the nodes to understand the <tt>mpiexec</tt> command that the pipeline sends via <tt>ssh</tt>, this should be set in your <tt>.cshrc</tt> file: <pre> module unload mpi/mpich-x86_64 module load mpi/openmpi-x86_64 </pre> The LOFAR software should also be sourced here, e.g.: <pre> alias lofar-newest 'setenv PATH /soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:${PATH} ; module load lofar; source /soft/lofar-050216/lofarinit.csh; setenv PATH /home/mjh/lofar/bin:$PATH; setenv PYTHONPATH /soft/pyrap-new:$PYTHONPATH; setenv LD_LIBRARY_PATH /soft/boost/lib:/soft/casacore-1.7.0/lib:$LD_LIBRARY_PATH' alias lofar-tools 'setenv PYTHONPATH /home/wwilliams/software/lib/python2.7/site-packages/:/home/wwilliams/software/anaconda2/lib/python2.7/site-packages/:/home/wwilliams/python/face-cal-scripts/:$PYTHONPATH' lofar-newest lofar-tools </pre> To allow multiple versions of the pipeline on different nodes, the ssh host checking needs to be bypassed for localhost. This can be done in your <tt>.ssh/config</tt> file: <pre> Host localhost StrictHostKeyChecking no UserKnownHostsFile=/dev/null </pre> = running the pipeline = The pipeline can be launched by submitting a job to the job control system, e.g.: <pre> qsub -l nodes=4:ppn=16 -W group_list=lofar ~/survey_pipeline/runfiles/qsub/run_pipe_L400115.qsub </pre> where <tt>nodes=4:ppn=16</tt> will request for the job to run on 4 entire (16 cpu) nodes. Alternatively it can be run on a part of a node (being careful to ensure that your parset matches the resources you are requesting): <pre> qsub -l nodes=1:ppn=8 -W group_list=lofar ~/survey_pipeline/runfiles/qsub/run_pipe_L400115.qsub </pre> An example <tt>qsub</tt> script is: <pre> #!/bin/bash #PBS -N pipeline-L424617 #PBS -l pmem=2gb #PBS -l walltime=24:00:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: ARRAYID = $PBS_ARRAYID echo ------------------------------------------------------ modulecmd bash load lofar ; PATH=/soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:${PATH} ; . /soft/lofar-050216/lofarinit.sh; PATH=/home/mjh/lofar/bin:$PATH; export PATH; PYTHONPATH=/soft/pyrap-new:$PYTHONPATH; export PYTHONPATH ; LD_LIBRARY_PATH=/soft/pyrap-new/lib64:/soft/boost/lib:/soft/casacore-1.7.0/lib:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> where <pre> genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> is the actual call to run the generic pipeline. The required arguments are a parset and configuration file. The parset is the important part which defines all the steps to be take. 
The configuration file should be something like: <pre> [DEFAULT] lofarroot = /soft/lofar-050216 casaroot = /soft/casacore-1.7.0 pyraproot = /soft hdf5root = wcsroot = /opt/cep/wcslib pythonpath = /soft/lofar-050216/lib64/python2.7/site-packages runtime_directory = /car-data/wwilliams/pipeline-output/ recipe_directories = [%(pythonpath)s/lofarpipe/recipes,/home/wwilliams/scripts/git/prefactor] working_directory = /car-data/wwilliams/pipeline-products/ task_files = [%(lofarroot)s/share/pipeline/tasks.cfg] [layout] job_directory = %(runtime_directory)s/%(job_name)s [cluster] clusterdesc = %(lofarroot)s/share/cep2.clusterdesc [deploy] engine_ppath = %(pythonpath)s:%(pyraproot)s/lib:/opt/cep/pythonlibs/lib/python/site-packages engine_lpath = %(lofarroot)s/lib:%(casaroot)s/lib:%(pyraproot)s/lib:%(hdf5root)s/lib:%(wcsroot)s/lib [logging] log_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/pipeline.log xml_stat_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/statistics.xml [feedback] # Method of providing feedback to LOFAR. # Valid options: # messagebus Send feedback and status using LCS/MessageBus # none Do NOT send feedback and status method = none [remote] method = mpiexec max_per_node = 16 </pre> * Note <tt>[feedback] method = none</tt> is set because the messagebus does not currently work on the Herts cluster. * <tt>[remote] method = mpiexec</tt> allows for the use of multiple nodes. * The <tt>recipe_directories</tt> should be modified to include the Pre-Facet-Cal recipes (<tt>/home/wwilliams/scripts/git/prefactor</tt>). = Pre-Facet calibration = Set up the pre-facet calibration pipeline * download scripts from https://github.com/lofar-astron/prefactor. In addition to the changes mentioned in the genericpipeline -> quick-start page, you need to add the Pre-Facet-Cal directory to the "recipe_directories" in the pipeline.cfg (so that the plugins that come with the Pre-Facet-Cal are found). The current cookbook-section on the pre-facet pipeline can be found at https://github.com/lofar-astron/prefactor/blob/AH-development/docs/cookbook_prefacet.pdf * Edit the Pre-Facet-Cal.parset appropriately (see notes in the parset) and run the generic pipeline with this pre-facet-calibration parset. == Some known problems == * in argument.flags there should be no spaces! e.g. <pre>h5imp_cal.argument.flags = [h5_imp_map.output.mapfile, h5imp_cal_losoto.h5]</pre> be10d639abc59407c674454c1bc04798b37593d4 433 432 2016-02-17T10:50:34Z Wwilliams 13 /* running the pipeline */ wikitext text/x-wiki = The generic pipeline = The generic pipeline comes with LOFAR builds 2.13+. At http://www.astron.nl/citt/genericpipeline/#quick-start there is a description of how to setup and run the generic pipeline. At Herts, the pipeline is included in the most recent lofar build (2.15 : <tt>$LOFARROOT=/soft/lofar-050216/</tt>). * A patch has been made to allow the pipeline to run across multiple whole nodes, in that you can submit a job which will run on N nodes. The [[jobs|job control system]] will allocate the nodes to you and the generic pipeline will deal with distributing the processes to the other nodes. This patch is in <tt>$LOFARROOT/lib64/python2.7/site-packages/lofarpipe/support/remotecommand.py</tt> The pipeline runs sends processes to other available nodes using <tt>mpiexec</tt>, but requires the <tt>openmpi</tt> version instead of what is loaded as standard on the cluster (see also [[MPI|MPI]]). 
For all the nodes to understand the <tt>mpiexec</tt> command that the pipeline sends via <tt>ssh</tt>, this should be set in your <tt>.cshrc</tt> file: <pre> module unload mpi/mpich-x86_64 module load mpi/openmpi-x86_64 </pre> The LOFAR software should also be sourced here, e.g.: <pre> alias lofar-newest 'setenv PATH /soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:${PATH} ; module load lofar; source /soft/lofar-050216/lofarinit.csh; setenv PATH /home/mjh/lofar/bin:$PATH; setenv PYTHONPATH /soft/pyrap-new:$PYTHONPATH; setenv LD_LIBRARY_PATH /soft/boost/lib:/soft/casacore-1.7.0/lib:$LD_LIBRARY_PATH' alias lofar-tools 'setenv PYTHONPATH /home/wwilliams/software/lib/python2.7/site-packages/:/home/wwilliams/software/anaconda2/lib/python2.7/site-packages/:/home/wwilliams/python/face-cal-scripts/:$PYTHONPATH' lofar-newest lofar-tools </pre> To allow multiple versions of the pipeline on different nodes, the ssh host checking needs to be bypassed for localhost. This can be done in your <tt>.ssh/config</tt> file: <pre> Host localhost StrictHostKeyChecking no UserKnownHostsFile=/dev/null </pre> = running the pipeline = The pipeline can be launched by submitting a job to the job control system, e.g.: <pre> qsub -l nodes=4:ppn=16 -W group_list=lofar ~/survey_pipeline/runfiles/qsub/run_pipe_L400115.qsub </pre> where <tt>nodes=4:ppn=16</tt> will request for the job to run on 4 entire (16 cpu) nodes. Alternatively it can be run on a part of a node (being careful to ensure that your parset matches the resources you are requesting): <pre> qsub -l nodes=1:ppn=8 -W group_list=lofar ~/survey_pipeline/runfiles/qsub/run_pipe_L400115.qsub </pre> An example <tt>qsub</tt> script is: <pre> #!/bin/bash #PBS -N pipeline-L424617 #PBS -l walltime=168:00:00 #PBS -k o echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: ARRAYID = $PBS_ARRAYID echo ------------------------------------------------------ modulecmd bash load lofar ; PATH=/soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:${PATH} ; . /soft/lofar-050216/lofarinit.sh; PATH=/home/mjh/lofar/bin:$PATH; export PATH; PYTHONPATH=/soft/pyrap-new:$PYTHONPATH; export PYTHONPATH ; LD_LIBRARY_PATH=/soft/pyrap-new/lib64:/soft/boost/lib:/soft/casacore-1.7.0/lib:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> where <pre> genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> is the actual call to run the generic pipeline. The required arguments are a parset and configuration file. The parset is the important part which defines all the steps to be take. 
The configuration file should be something like: <pre> [DEFAULT] lofarroot = /soft/lofar-050216 casaroot = /soft/casacore-1.7.0 pyraproot = /soft hdf5root = wcsroot = /opt/cep/wcslib pythonpath = /soft/lofar-050216/lib64/python2.7/site-packages runtime_directory = /car-data/wwilliams/pipeline-output/ recipe_directories = [%(pythonpath)s/lofarpipe/recipes,/home/wwilliams/scripts/git/prefactor] working_directory = /car-data/wwilliams/pipeline-products/ task_files = [%(lofarroot)s/share/pipeline/tasks.cfg] [layout] job_directory = %(runtime_directory)s/%(job_name)s [cluster] clusterdesc = %(lofarroot)s/share/cep2.clusterdesc [deploy] engine_ppath = %(pythonpath)s:%(pyraproot)s/lib:/opt/cep/pythonlibs/lib/python/site-packages engine_lpath = %(lofarroot)s/lib:%(casaroot)s/lib:%(pyraproot)s/lib:%(hdf5root)s/lib:%(wcsroot)s/lib [logging] log_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/pipeline.log xml_stat_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/statistics.xml [feedback] # Method of providing feedback to LOFAR. # Valid options: # messagebus Send feedback and status using LCS/MessageBus # none Do NOT send feedback and status method = none [remote] method = mpiexec max_per_node = 16 </pre> * Note <tt>[feedback] method = none</tt> is set because the messagebus does not currently work on the Herts cluster. * <tt>[remote] method = mpiexec</tt> allows for the use of multiple nodes. * The <tt>recipe_directories</tt> should be modified to include the Pre-Facet-Cal recipes (<tt>/home/wwilliams/scripts/git/prefactor</tt>). = Pre-Facet calibration = Set up the pre-facet calibration pipeline * download scripts from https://github.com/lofar-astron/prefactor. In addition to the changes mentioned in the genericpipeline -> quick-start page, you need to add the Pre-Facet-Cal directory to the "recipe_directories" in the pipeline.cfg (so that the plugins that come with the Pre-Facet-Cal are found). The current cookbook-section on the pre-facet pipeline can be found at https://github.com/lofar-astron/prefactor/blob/AH-development/docs/cookbook_prefacet.pdf * Edit the Pre-Facet-Cal.parset appropriately (see notes in the parset) and run the generic pipeline with this pre-facet-calibration parset. == Some known problems == * in argument.flags there should be no spaces! e.g. <pre>h5imp_cal.argument.flags = [h5_imp_map.output.mapfile, h5imp_cal_losoto.h5]</pre> 7e0eb197849499ec0399f7a4c708eec5a58e7c75 434 433 2016-02-19T15:44:27Z Wwilliams 13 /* The generic pipeline */ wikitext text/x-wiki = The generic pipeline = The generic pipeline comes with LOFAR builds 2.13+. At http://www.astron.nl/citt/genericpipeline/#quick-start there is a description of how to setup and run the generic pipeline. At Herts, the pipeline is included in the most recent lofar build (2.15 : <tt>$LOFARROOT=/soft/lofar-050216/</tt>). * A patch has been made to allow the pipeline to run across multiple whole nodes, in that you can submit a job which will run on N nodes. The [[jobs|job control system]] will allocate the nodes to you and the generic pipeline will deal with distributing the processes to the other nodes. This patch is in <tt>$LOFARROOT/lib64/python2.7/site-packages/lofarpipe/support/remotecommand.py</tt> The pipeline runs sends processes to other available nodes using <tt>mpiexec</tt>, but requires the <tt>openmpi</tt> version instead of what is loaded as standard on the cluster (see also [[MPI|MPI]]). 
For all the nodes to understand the <tt>mpiexec</tt> command that the pipeline sends via <tt>ssh</tt>, this should be set in your <tt>.cshrc</tt> file: <pre> module unload mpi/mpich-x86_64 module load mpi/openmpi-x86_64 </pre> The LOFAR software should also be sourced here, e.g.: <pre> alias lofar-newest 'setenv PATH /soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:${PATH} ; module load lofar; source /soft/lofar-050216/lofarinit.csh; setenv PATH /home/mjh/lofar/bin:$PATH; setenv PYTHONPATH /soft/pyrap-new:$PYTHONPATH; setenv LD_LIBRARY_PATH /soft/boost/lib:/soft/casacore-1.7.0/lib:$LD_LIBRARY_PATH' alias lofar-tools 'setenv PYTHONPATH /home/wwilliams/software/lib/python2.7/site-packages/:/home/wwilliams/software/anaconda2/lib/python2.7/site-packages/:/home/wwilliams/python/face-cal-scripts/:$PYTHONPATH' lofar-newest lofar-tools </pre> To allow multiple versions of the pipeline on different nodes, the ssh host checking needs to be bypassed for localhost. This can be done in your <tt>.ssh/config</tt> file: <pre> Host localhost StrictHostKeyChecking no UserKnownHostsFile=/dev/null </pre> Passwordless SSH access between the nodes needs to be set up. = running the pipeline = The pipeline can be launched by submitting a job to the job control system, e.g.: <pre> qsub -l nodes=4:ppn=16 -W group_list=lofar ~/survey_pipeline/runfiles/qsub/run_pipe_L400115.qsub </pre> where <tt>nodes=4:ppn=16</tt> will request for the job to run on 4 entire (16 cpu) nodes. Alternatively it can be run on a part of a node (being careful to ensure that your parset matches the resources you are requesting): <pre> qsub -l nodes=1:ppn=8 -W group_list=lofar ~/survey_pipeline/runfiles/qsub/run_pipe_L400115.qsub </pre> An example <tt>qsub</tt> script is: <pre> #!/bin/bash #PBS -N pipeline-L424617 #PBS -l walltime=168:00:00 #PBS -k o echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: ARRAYID = $PBS_ARRAYID echo ------------------------------------------------------ modulecmd bash load lofar ; PATH=/soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:${PATH} ; . /soft/lofar-050216/lofarinit.sh; PATH=/home/mjh/lofar/bin:$PATH; export PATH; PYTHONPATH=/soft/pyrap-new:$PYTHONPATH; export PYTHONPATH ; LD_LIBRARY_PATH=/soft/pyrap-new/lib64:/soft/boost/lib:/soft/casacore-1.7.0/lib:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> where <pre> genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> is the actual call to run the generic pipeline. The required arguments are a parset and configuration file. The parset is the important part which defines all the steps to be take. 
The configuration file should be something like: <pre> [DEFAULT] lofarroot = /soft/lofar-050216 casaroot = /soft/casacore-1.7.0 pyraproot = /soft hdf5root = wcsroot = /opt/cep/wcslib pythonpath = /soft/lofar-050216/lib64/python2.7/site-packages runtime_directory = /car-data/wwilliams/pipeline-output/ recipe_directories = [%(pythonpath)s/lofarpipe/recipes,/home/wwilliams/scripts/git/prefactor] working_directory = /car-data/wwilliams/pipeline-products/ task_files = [%(lofarroot)s/share/pipeline/tasks.cfg] [layout] job_directory = %(runtime_directory)s/%(job_name)s [cluster] clusterdesc = %(lofarroot)s/share/cep2.clusterdesc [deploy] engine_ppath = %(pythonpath)s:%(pyraproot)s/lib:/opt/cep/pythonlibs/lib/python/site-packages engine_lpath = %(lofarroot)s/lib:%(casaroot)s/lib:%(pyraproot)s/lib:%(hdf5root)s/lib:%(wcsroot)s/lib [logging] log_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/pipeline.log xml_stat_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/statistics.xml [feedback] # Method of providing feedback to LOFAR. # Valid options: # messagebus Send feedback and status using LCS/MessageBus # none Do NOT send feedback and status method = none [remote] method = mpiexec max_per_node = 16 </pre> * Note <tt>[feedback] method = none</tt> is set because the messagebus does not currently work on the Herts cluster. * <tt>[remote] method = mpiexec</tt> allows for the use of multiple nodes. * The <tt>recipe_directories</tt> should be modified to include the Pre-Facet-Cal recipes (<tt>/home/wwilliams/scripts/git/prefactor</tt>). = Pre-Facet calibration = Set up the pre-facet calibration pipeline * download scripts from https://github.com/lofar-astron/prefactor. In addition to the changes mentioned in the genericpipeline -> quick-start page, you need to add the Pre-Facet-Cal directory to the "recipe_directories" in the pipeline.cfg (so that the plugins that come with the Pre-Facet-Cal are found). The current cookbook-section on the pre-facet pipeline can be found at https://github.com/lofar-astron/prefactor/blob/AH-development/docs/cookbook_prefacet.pdf * Edit the Pre-Facet-Cal.parset appropriately (see notes in the parset) and run the generic pipeline with this pre-facet-calibration parset. == Some known problems == * in argument.flags there should be no spaces! e.g. <pre>h5imp_cal.argument.flags = [h5_imp_map.output.mapfile, h5imp_cal_losoto.h5]</pre> f2eefcef40cde6c6721d1e069b4ae003c2727365 437 434 2016-03-03T14:27:41Z Wwilliams 13 /* The generic pipeline */ update paths to latest lofar wikitext text/x-wiki = The generic pipeline = The generic pipeline comes with LOFAR builds 2.13+. At http://www.astron.nl/citt/genericpipeline/#quick-start there is a description of how to setup and run the generic pipeline. At Herts, the pipeline is included in the most recent lofar build (2.15 : <tt>$LOFARROOT=/soft/lofar-220216/</tt>). * A patch has been made to allow the pipeline to run across multiple whole nodes, in that you can submit a job which will run on N nodes. The [[jobs|job control system]] will allocate the nodes to you and the generic pipeline will deal with distributing the processes to the other nodes. This patch is in <tt>$LOFARROOT/lib64/python2.7/site-packages/lofarpipe/support/remotecommand.py</tt> The pipeline runs sends processes to other available nodes using <tt>mpiexec</tt>, but requires the <tt>openmpi</tt> version instead of what is loaded as standard on the cluster (see also [[MPI|MPI]]). 
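Since the pipeline dispatches its remote processes over <tt>ssh</tt>, passwordless SSH between the nodes is also needed (as noted below). A minimal sketch of setting this up, assuming your home directory is shared across the nodes (the key type and file names are just the usual defaults, not anything specific to this cluster):
<pre>
# generate a key if you do not already have one (no passphrase)
ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
# authorise it for your own account; with a shared home directory this covers all nodes
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
</pre>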
For all the nodes to understand the <tt>mpiexec</tt> command that the pipeline sends via <tt>ssh</tt>, this should be set in your <tt>.cshrc</tt> file: <pre> module unload mpi/mpich-x86_64 module load mpi/openmpi-x86_64 </pre> The LOFAR software should also be sourced here, so that all the nodes can run the lofar commands, e.g.: <pre> alias lofar-newest 'module load lofar ; setenv PATH /soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:/soft/casacore-220216/bin:${PATH} ; source /soft/lofar-220216/lofarinit.csh ; setenv PATH /home/mjh/lofar/bin:$PATH ; setenv PYTHONPATH /soft/pyrap-220216/usr/lib64/python2.7/site-packages:/soft/lofar-220216/lib64/python2.7/site-packages:$PYTHONPATH ; setenv LD_LIBRARY_PATH=/soft/boost/lib:/soft/casacore-220216/lib:$LD_LIBRARY_PATH' alias lofar-tools 'setenv PYTHONPATH /home/wwilliams/software/lib/python2.7/site-packages/:/home/wwilliams/software/anaconda2/lib/python2.7/site-packages/:/home/wwilliams/python/face-cal-scripts/:$PYTHONPATH' lofar-newest lofar-tools </pre> To allow multiple versions of the pipeline on different nodes, the ssh host checking needs to be bypassed for localhost. This can be done in your <tt>.ssh/config</tt> file: <pre> Host localhost StrictHostKeyChecking no UserKnownHostsFile=/dev/null </pre> Passwordless SSH access between the nodes needs to be set up. = running the pipeline = The pipeline can be launched by submitting a job to the job control system, e.g.: <pre> qsub -l nodes=4:ppn=16 -W group_list=lofar ~/survey_pipeline/runfiles/qsub/run_pipe_L400115.qsub </pre> where <tt>nodes=4:ppn=16</tt> will request for the job to run on 4 entire (16 cpu) nodes. Alternatively it can be run on a part of a node (being careful to ensure that your parset matches the resources you are requesting): <pre> qsub -l nodes=1:ppn=8 -W group_list=lofar ~/survey_pipeline/runfiles/qsub/run_pipe_L400115.qsub </pre> An example <tt>qsub</tt> script is: <pre> #!/bin/bash #PBS -N pipeline-L424617 #PBS -l walltime=168:00:00 #PBS -k o echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: ARRAYID = $PBS_ARRAYID echo ------------------------------------------------------ modulecmd bash load lofar ; PATH=/soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:${PATH} ; . /soft/lofar-050216/lofarinit.sh; PATH=/home/mjh/lofar/bin:$PATH; export PATH; PYTHONPATH=/soft/pyrap-new:$PYTHONPATH; export PYTHONPATH ; LD_LIBRARY_PATH=/soft/pyrap-new/lib64:/soft/boost/lib:/soft/casacore-1.7.0/lib:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> where <pre> genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> is the actual call to run the generic pipeline. The required arguments are a parset and configuration file. The parset is the important part which defines all the steps to be take. 
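Once the job has been submitted, you can check that it is running and follow the pipeline log, whose location is set by the <tt>log_file</tt> entry in the configuration file below. For example (the username, job name and timestamp in the path are placeholders for your own run):
<pre>
qstat -u $USER
tail -f /car-data/<username>/pipeline-output/<job_name>/logs/<start_time>/pipeline.log
</pre>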
The configuration file should be something like: <pre> [DEFAULT] lofarroot = /soft/lofar-050216 casaroot = /soft/casacore-1.7.0 pyraproot = /soft hdf5root = wcsroot = /opt/cep/wcslib pythonpath = /soft/lofar-050216/lib64/python2.7/site-packages runtime_directory = /car-data/wwilliams/pipeline-output/ recipe_directories = [%(pythonpath)s/lofarpipe/recipes,/home/wwilliams/scripts/git/prefactor] working_directory = /car-data/wwilliams/pipeline-products/ task_files = [%(lofarroot)s/share/pipeline/tasks.cfg] [layout] job_directory = %(runtime_directory)s/%(job_name)s [cluster] clusterdesc = %(lofarroot)s/share/cep2.clusterdesc [deploy] engine_ppath = %(pythonpath)s:%(pyraproot)s/lib:/opt/cep/pythonlibs/lib/python/site-packages engine_lpath = %(lofarroot)s/lib:%(casaroot)s/lib:%(pyraproot)s/lib:%(hdf5root)s/lib:%(wcsroot)s/lib [logging] log_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/pipeline.log xml_stat_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/statistics.xml [feedback] # Method of providing feedback to LOFAR. # Valid options: # messagebus Send feedback and status using LCS/MessageBus # none Do NOT send feedback and status method = none [remote] method = mpiexec max_per_node = 16 </pre> * Note <tt>[feedback] method = none</tt> is set because the messagebus does not currently work on the Herts cluster. * <tt>[remote] method = mpiexec</tt> allows for the use of multiple nodes. * The <tt>recipe_directories</tt> should be modified to include the Pre-Facet-Cal recipes (<tt>/home/wwilliams/scripts/git/prefactor</tt>). = Pre-Facet calibration = Set up the pre-facet calibration pipeline * download scripts from https://github.com/lofar-astron/prefactor. In addition to the changes mentioned in the genericpipeline -> quick-start page, you need to add the Pre-Facet-Cal directory to the "recipe_directories" in the pipeline.cfg (so that the plugins that come with the Pre-Facet-Cal are found). The current cookbook-section on the pre-facet pipeline can be found at https://github.com/lofar-astron/prefactor/blob/AH-development/docs/cookbook_prefacet.pdf * Edit the Pre-Facet-Cal.parset appropriately (see notes in the parset) and run the generic pipeline with this pre-facet-calibration parset. == Some known problems == * in argument.flags there should be no spaces! e.g. <pre>h5imp_cal.argument.flags = [h5_imp_map.output.mapfile, h5imp_cal_losoto.h5]</pre> 409a9b05c17026eb6ec0b3d173fe4e772da24582 438 437 2016-03-03T14:29:42Z Wwilliams 13 /* running the pipeline */ wikitext text/x-wiki = The generic pipeline = The generic pipeline comes with LOFAR builds 2.13+. At http://www.astron.nl/citt/genericpipeline/#quick-start there is a description of how to setup and run the generic pipeline. At Herts, the pipeline is included in the most recent lofar build (2.15 : <tt>$LOFARROOT=/soft/lofar-220216/</tt>). * A patch has been made to allow the pipeline to run across multiple whole nodes, in that you can submit a job which will run on N nodes. The [[jobs|job control system]] will allocate the nodes to you and the generic pipeline will deal with distributing the processes to the other nodes. This patch is in <tt>$LOFARROOT/lib64/python2.7/site-packages/lofarpipe/support/remotecommand.py</tt> The pipeline runs sends processes to other available nodes using <tt>mpiexec</tt>, but requires the <tt>openmpi</tt> version instead of what is loaded as standard on the cluster (see also [[MPI|MPI]]). 
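Once the <tt>.cshrc</tt> changes described below are in place, you can confirm that a non-interactive login picks up the OpenMPI <tt>mpiexec</tt> rather than the MPICH one, since this is how the pipeline starts its remote processes. A quick check (the node name is a placeholder for any compute node you have access to):
<pre>
ssh node001 'which mpiexec; mpiexec --version'
</pre>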
For all the nodes to understand the <tt>mpiexec</tt> command that the pipeline sends via <tt>ssh</tt>, this should be set in your <tt>.cshrc</tt> file: <pre> module unload mpi/mpich-x86_64 module load mpi/openmpi-x86_64 </pre> The LOFAR software should also be sourced here, so that all the nodes can run the lofar commands, e.g.: <pre> alias lofar-newest 'module load lofar ; setenv PATH /soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:/soft/casacore-220216/bin:${PATH} ; source /soft/lofar-220216/lofarinit.csh ; setenv PATH /home/mjh/lofar/bin:$PATH ; setenv PYTHONPATH /soft/pyrap-220216/usr/lib64/python2.7/site-packages:/soft/lofar-220216/lib64/python2.7/site-packages:$PYTHONPATH ; setenv LD_LIBRARY_PATH=/soft/boost/lib:/soft/casacore-220216/lib:$LD_LIBRARY_PATH' alias lofar-tools 'setenv PYTHONPATH /home/wwilliams/software/lib/python2.7/site-packages/:/home/wwilliams/software/anaconda2/lib/python2.7/site-packages/:/home/wwilliams/python/face-cal-scripts/:$PYTHONPATH' lofar-newest lofar-tools </pre> To allow multiple versions of the pipeline on different nodes, the ssh host checking needs to be bypassed for localhost. This can be done in your <tt>.ssh/config</tt> file: <pre> Host localhost StrictHostKeyChecking no UserKnownHostsFile=/dev/null </pre> Passwordless SSH access between the nodes needs to be set up. = running the pipeline = The pipeline can be launched by submitting a job to the job control system, e.g.: <pre> qsub -l nodes=4:ppn=16 -W group_list=lofar ~/survey_pipeline/runfiles/qsub/run_pipe_L400115.qsub </pre> where <tt>nodes=4:ppn=16</tt> will request for the job to run on 4 entire (16 cpu) nodes. Alternatively it can be run on a part of a node (being careful to ensure that your parset matches the resources you are requesting): <pre> qsub -l nodes=1:ppn=8 -W group_list=lofar ~/survey_pipeline/runfiles/qsub/run_pipe_L400115.qsub </pre> An example <tt>qsub</tt> script is: <pre> #!/bin/bash #PBS -N pipeline-L424617 #PBS -l walltime=168:00:00 #PBS -k o echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: ARRAYID = $PBS_ARRAYID echo ------------------------------------------------------ modulecmd bash load lofar PATH=/soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:/soft/casacore-220216/bin:${PATH} . /soft/lofar-220216/lofarinit.sh PATH=/home/mjh/lofar/bin:$PATH PYTHONPATH=/soft/pyrap-220216/usr/lib64/python2.7/site-packages:/soft/lofar-220216/lib64/python2.7/site-packages:$PYTHONPATH LD_LIBRARY_PATH=/soft/boost/lib:/soft/casacore-220216/lib:$LD_LIBRARY_PATH genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> where <pre> genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> is the actual call to run the generic pipeline. The required arguments are a parset and configuration file. The parset is the important part which defines all the steps to be take. 
The configuration file should be something like: <pre> [DEFAULT] lofarroot = /soft/lofar-220216 casaroot = /soft/casacore-1.7.0 pyraproot = /soft hdf5root = wcsroot = /opt/cep/wcslib pythonpath = /soft/lofar-220216/lib64/python2.7/site-packages runtime_directory = /car-data/wwilliams/pipeline-output/ recipe_directories = [%(pythonpath)s/lofarpipe/recipes,/home/wwilliams/scripts/git/prefactor] working_directory = /car-data/wwilliams/pipeline-products/ task_files = [%(lofarroot)s/share/pipeline/tasks.cfg] [layout] job_directory = %(runtime_directory)s/%(job_name)s [cluster] clusterdesc = %(lofarroot)s/share/cep2.clusterdesc [deploy] engine_ppath = %(pythonpath)s:%(pyraproot)s/lib:/opt/cep/pythonlibs/lib/python/site-packages engine_lpath = %(lofarroot)s/lib:%(casaroot)s/lib:%(pyraproot)s/lib:%(hdf5root)s/lib:%(wcsroot)s/lib [logging] log_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/pipeline.log xml_stat_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/statistics.xml [feedback] # Method of providing feedback to LOFAR. # Valid options: # messagebus Send feedback and status using LCS/MessageBus # none Do NOT send feedback and status method = none [remote] method = mpiexec max_per_node = 16 </pre> * Note <tt>[feedback] method = none</tt> is set because the messagebus does not currently work on the Herts cluster. * <tt>[remote] method = mpiexec</tt> allows for the use of multiple nodes. * The <tt>recipe_directories</tt> should be modified to include the Pre-Facet-Cal recipes (<tt>/home/wwilliams/scripts/git/prefactor</tt>). = Pre-Facet calibration = Set up the pre-facet calibration pipeline * download scripts from https://github.com/lofar-astron/prefactor. In addition to the changes mentioned in the genericpipeline -> quick-start page, you need to add the Pre-Facet-Cal directory to the "recipe_directories" in the pipeline.cfg (so that the plugins that come with the Pre-Facet-Cal are found). The current cookbook-section on the pre-facet pipeline can be found at https://github.com/lofar-astron/prefactor/blob/AH-development/docs/cookbook_prefacet.pdf * Edit the Pre-Facet-Cal.parset appropriately (see notes in the parset) and run the generic pipeline with this pre-facet-calibration parset. == Some known problems == * in argument.flags there should be no spaces! e.g. <pre>h5imp_cal.argument.flags = [h5_imp_map.output.mapfile, h5imp_cal_losoto.h5]</pre> c3efd3b5f0d494e1238bac03947916cd07be175a 439 438 2016-03-03T14:31:52Z Wwilliams 13 /* running the pipeline */ wikitext text/x-wiki = The generic pipeline = The generic pipeline comes with LOFAR builds 2.13+. At http://www.astron.nl/citt/genericpipeline/#quick-start there is a description of how to setup and run the generic pipeline. At Herts, the pipeline is included in the most recent lofar build (2.15 : <tt>$LOFARROOT=/soft/lofar-220216/</tt>). * A patch has been made to allow the pipeline to run across multiple whole nodes, in that you can submit a job which will run on N nodes. The [[jobs|job control system]] will allocate the nodes to you and the generic pipeline will deal with distributing the processes to the other nodes. This patch is in <tt>$LOFARROOT/lib64/python2.7/site-packages/lofarpipe/support/remotecommand.py</tt> The pipeline runs sends processes to other available nodes using <tt>mpiexec</tt>, but requires the <tt>openmpi</tt> version instead of what is loaded as standard on the cluster (see also [[MPI|MPI]]). 
For all the nodes to understand the <tt>mpiexec</tt> command that the pipeline sends via <tt>ssh</tt>, this should be set in your <tt>.cshrc</tt> file: <pre> module unload mpi/mpich-x86_64 module load mpi/openmpi-x86_64 </pre> The LOFAR software should also be sourced here, so that all the nodes can run the lofar commands, e.g.: <pre> alias lofar-newest 'module load lofar ; setenv PATH /soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:/soft/casacore-220216/bin:${PATH} ; source /soft/lofar-220216/lofarinit.csh ; setenv PATH /home/mjh/lofar/bin:$PATH ; setenv PYTHONPATH /soft/pyrap-220216/usr/lib64/python2.7/site-packages:/soft/lofar-220216/lib64/python2.7/site-packages:$PYTHONPATH ; setenv LD_LIBRARY_PATH=/soft/boost/lib:/soft/casacore-220216/lib:$LD_LIBRARY_PATH' alias lofar-tools 'setenv PYTHONPATH /home/wwilliams/software/lib/python2.7/site-packages/:/home/wwilliams/software/anaconda2/lib/python2.7/site-packages/:/home/wwilliams/python/face-cal-scripts/:$PYTHONPATH' lofar-newest lofar-tools </pre> To allow multiple versions of the pipeline on different nodes, the ssh host checking needs to be bypassed for localhost. This can be done in your <tt>.ssh/config</tt> file: <pre> Host localhost StrictHostKeyChecking no UserKnownHostsFile=/dev/null </pre> Passwordless SSH access between the nodes needs to be set up. = running the pipeline = The pipeline can be launched by submitting a job to the job control system, e.g.: <pre> qsub -l nodes=4:ppn=16 -W group_list=lofar ~/survey_pipeline/runfiles/qsub/run_pipe_L400115.qsub </pre> where <tt>nodes=4:ppn=16</tt> will request for the job to run on 4 entire (16 cpu) nodes. Alternatively it can be run on a part of a node (being careful to ensure that your parset matches the resources you are requesting): <pre> qsub -l nodes=1:ppn=8 -W group_list=lofar ~/survey_pipeline/runfiles/qsub/run_pipe_L400115.qsub </pre> An example <tt>qsub</tt> script is: <pre> #!/bin/bash #PBS -N pipeline-L424617 #PBS -l walltime=168:00:00 #PBS -k o echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: ARRAYID = $PBS_ARRAYID echo ------------------------------------------------------ modulecmd bash load lofar PATH=/soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:/soft/casacore-220216/bin:${PATH} . /soft/lofar-220216/lofarinit.sh PATH=/home/mjh/lofar/bin:$PATH PYTHONPATH=/soft/pyrap-220216/usr/lib64/python2.7/site-packages:/soft/lofar-220216/lib64/python2.7/site-packages:$PYTHONPATH LD_LIBRARY_PATH=/soft/boost/lib:/soft/casacore-220216/lib:$LD_LIBRARY_PATH genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> where <pre> genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> is the actual call to run the generic pipeline. The required arguments are a parset and configuration file. You can make a more general submit script by passing the parset as an argument and using the <pre>-v</pre> flag on your qsub command. The parset is the important part which defines all the steps to be take. 
The configuration file should be something like: <pre> [DEFAULT] lofarroot = /soft/lofar-220216 casaroot = /soft/casacore-1.7.0 pyraproot = /soft hdf5root = wcsroot = /opt/cep/wcslib pythonpath = /soft/lofar-220216/lib64/python2.7/site-packages runtime_directory = /car-data/wwilliams/pipeline-output/ recipe_directories = [%(pythonpath)s/lofarpipe/recipes,/home/wwilliams/scripts/git/prefactor] working_directory = /car-data/wwilliams/pipeline-products/ task_files = [%(lofarroot)s/share/pipeline/tasks.cfg] [layout] job_directory = %(runtime_directory)s/%(job_name)s [cluster] clusterdesc = %(lofarroot)s/share/cep2.clusterdesc [deploy] engine_ppath = %(pythonpath)s:%(pyraproot)s/lib:/opt/cep/pythonlibs/lib/python/site-packages engine_lpath = %(lofarroot)s/lib:%(casaroot)s/lib:%(pyraproot)s/lib:%(hdf5root)s/lib:%(wcsroot)s/lib [logging] log_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/pipeline.log xml_stat_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/statistics.xml [feedback] # Method of providing feedback to LOFAR. # Valid options: # messagebus Send feedback and status using LCS/MessageBus # none Do NOT send feedback and status method = none [remote] method = mpiexec max_per_node = 16 </pre> * Note <tt>[feedback] method = none</tt> is set because the messagebus does not currently work on the Herts cluster. * <tt>[remote] method = mpiexec</tt> allows for the use of multiple nodes. * The <tt>recipe_directories</tt> should be modified to include the Pre-Facet-Cal recipes (<tt>/home/wwilliams/scripts/git/prefactor</tt>). = Pre-Facet calibration = Set up the pre-facet calibration pipeline * download scripts from https://github.com/lofar-astron/prefactor. In addition to the changes mentioned in the genericpipeline -> quick-start page, you need to add the Pre-Facet-Cal directory to the "recipe_directories" in the pipeline.cfg (so that the plugins that come with the Pre-Facet-Cal are found). The current cookbook-section on the pre-facet pipeline can be found at https://github.com/lofar-astron/prefactor/blob/AH-development/docs/cookbook_prefacet.pdf * Edit the Pre-Facet-Cal.parset appropriately (see notes in the parset) and run the generic pipeline with this pre-facet-calibration parset. == Some known problems == * in argument.flags there should be no spaces! e.g. <pre>h5imp_cal.argument.flags = [h5_imp_map.output.mapfile, h5imp_cal_losoto.h5]</pre> 5778aff6b610b98037cf0737e5ddfb7901bedc7c 440 439 2016-03-03T14:34:37Z Wwilliams 13 /* running the pipeline */ wikitext text/x-wiki = The generic pipeline = The generic pipeline comes with LOFAR builds 2.13+. At http://www.astron.nl/citt/genericpipeline/#quick-start there is a description of how to setup and run the generic pipeline. At Herts, the pipeline is included in the most recent lofar build (2.15 : <tt>$LOFARROOT=/soft/lofar-220216/</tt>). * A patch has been made to allow the pipeline to run across multiple whole nodes, in that you can submit a job which will run on N nodes. The [[jobs|job control system]] will allocate the nodes to you and the generic pipeline will deal with distributing the processes to the other nodes. This patch is in <tt>$LOFARROOT/lib64/python2.7/site-packages/lofarpipe/support/remotecommand.py</tt> The pipeline runs sends processes to other available nodes using <tt>mpiexec</tt>, but requires the <tt>openmpi</tt> version instead of what is loaded as standard on the cluster (see also [[MPI|MPI]]). 
For all the nodes to understand the <tt>mpiexec</tt> command that the pipeline sends via <tt>ssh</tt>, this should be set in your <tt>.cshrc</tt> file: <pre> module unload mpi/mpich-x86_64 module load mpi/openmpi-x86_64 </pre> The LOFAR software should also be sourced here, so that all the nodes can run the lofar commands, e.g.: <pre> alias lofar-newest 'module load lofar ; setenv PATH /soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:/soft/casacore-220216/bin:${PATH} ; source /soft/lofar-220216/lofarinit.csh ; setenv PATH /home/mjh/lofar/bin:$PATH ; setenv PYTHONPATH /soft/pyrap-220216/usr/lib64/python2.7/site-packages:/soft/lofar-220216/lib64/python2.7/site-packages:$PYTHONPATH ; setenv LD_LIBRARY_PATH=/soft/boost/lib:/soft/casacore-220216/lib:$LD_LIBRARY_PATH' alias lofar-tools 'setenv PYTHONPATH /home/wwilliams/software/lib/python2.7/site-packages/:/home/wwilliams/software/anaconda2/lib/python2.7/site-packages/:/home/wwilliams/python/face-cal-scripts/:$PYTHONPATH' lofar-newest lofar-tools </pre> To allow multiple versions of the pipeline on different nodes, the ssh host checking needs to be bypassed for localhost. This can be done in your <tt>.ssh/config</tt> file: <pre> Host localhost StrictHostKeyChecking no UserKnownHostsFile=/dev/null </pre> Passwordless SSH access between the nodes needs to be set up. = running the pipeline = The pipeline can be launched by submitting a job to the job control system, e.g.: <pre> qsub -l nodes=4:ppn=16 -W group_list=lofar ~/survey_pipeline/runfiles/qsub/run_pipe_L400115.qsub </pre> where <tt>nodes=4:ppn=16</tt> will request for the job to run on 4 entire (16 cpu) nodes. Alternatively it can be run on a part of a node (being careful to ensure that your parset matches the resources you are requesting): <pre> qsub -l nodes=1:ppn=8 -W group_list=lofar ~/survey_pipeline/runfiles/qsub/run_pipe_L400115.qsub </pre> An example <tt>qsub</tt> script is: <pre> #!/bin/bash #PBS -N pipeline-L424617 #PBS -l walltime=168:00:00 #PBS -k o echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: ARRAYID = $PBS_ARRAYID echo ------------------------------------------------------ modulecmd bash load lofar PATH=/soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:/soft/casacore-220216/bin:${PATH} . /soft/lofar-220216/lofarinit.sh PATH=/home/mjh/lofar/bin:$PATH PYTHONPATH=/soft/pyrap-220216/usr/lib64/python2.7/site-packages:/soft/lofar-220216/lib64/python2.7/site-packages:$PYTHONPATH LD_LIBRARY_PATH=/soft/boost/lib:/soft/casacore-220216/lib:$LD_LIBRARY_PATH genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> where <pre> genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> is the actual call to run the generic pipeline. The calls to source the LOFAR software are required if this is not set in your <tt>.bashrc</tt>. The required arguments are a parset and configuration file. You can make a more general submit script by passing the parset as an argument and using the <tt>-v</tt> flag on your qsub command. 
The parset is the important part which defines all the steps to be take. The configuration file should be something like: <pre> [DEFAULT] lofarroot = /soft/lofar-220216 casaroot = /soft/casacore-1.7.0 pyraproot = /soft hdf5root = wcsroot = /opt/cep/wcslib pythonpath = /soft/lofar-220216/lib64/python2.7/site-packages runtime_directory = /car-data/wwilliams/pipeline-output/ recipe_directories = [%(pythonpath)s/lofarpipe/recipes,/home/wwilliams/scripts/git/prefactor] working_directory = /car-data/wwilliams/pipeline-products/ task_files = [%(lofarroot)s/share/pipeline/tasks.cfg] [layout] job_directory = %(runtime_directory)s/%(job_name)s [cluster] clusterdesc = %(lofarroot)s/share/cep2.clusterdesc [deploy] engine_ppath = %(pythonpath)s:%(pyraproot)s/lib:/opt/cep/pythonlibs/lib/python/site-packages engine_lpath = %(lofarroot)s/lib:%(casaroot)s/lib:%(pyraproot)s/lib:%(hdf5root)s/lib:%(wcsroot)s/lib [logging] log_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/pipeline.log xml_stat_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/statistics.xml [feedback] # Method of providing feedback to LOFAR. # Valid options: # messagebus Send feedback and status using LCS/MessageBus # none Do NOT send feedback and status method = none [remote] method = mpiexec max_per_node = 16 </pre> * Note <tt>[feedback] method = none</tt> is set because the messagebus does not currently work on the Herts cluster. * <tt>[remote] method = mpiexec</tt> allows for the use of multiple nodes. * The <tt>recipe_directories</tt> should be modified to include the Pre-Facet-Cal recipes (<tt>/home/wwilliams/scripts/git/prefactor</tt>). = Pre-Facet calibration = Set up the pre-facet calibration pipeline * download scripts from https://github.com/lofar-astron/prefactor. In addition to the changes mentioned in the genericpipeline -> quick-start page, you need to add the Pre-Facet-Cal directory to the "recipe_directories" in the pipeline.cfg (so that the plugins that come with the Pre-Facet-Cal are found). The current cookbook-section on the pre-facet pipeline can be found at https://github.com/lofar-astron/prefactor/blob/AH-development/docs/cookbook_prefacet.pdf * Edit the Pre-Facet-Cal.parset appropriately (see notes in the parset) and run the generic pipeline with this pre-facet-calibration parset. == Some known problems == * in argument.flags there should be no spaces! e.g. <pre>h5imp_cal.argument.flags = [h5_imp_map.output.mapfile, h5imp_cal_losoto.h5]</pre> f09ab79c13c2c897d7ac2ecfdfc7076771935d4e 441 440 2016-03-03T14:37:07Z Wwilliams 13 /* running the pipeline */ wikitext text/x-wiki = The generic pipeline = The generic pipeline comes with LOFAR builds 2.13+. At http://www.astron.nl/citt/genericpipeline/#quick-start there is a description of how to setup and run the generic pipeline. At Herts, the pipeline is included in the most recent lofar build (2.15 : <tt>$LOFARROOT=/soft/lofar-220216/</tt>). * A patch has been made to allow the pipeline to run across multiple whole nodes, in that you can submit a job which will run on N nodes. The [[jobs|job control system]] will allocate the nodes to you and the generic pipeline will deal with distributing the processes to the other nodes. This patch is in <tt>$LOFARROOT/lib64/python2.7/site-packages/lofarpipe/support/remotecommand.py</tt> The pipeline runs sends processes to other available nodes using <tt>mpiexec</tt>, but requires the <tt>openmpi</tt> version instead of what is loaded as standard on the cluster (see also [[MPI|MPI]]). 
For all the nodes to understand the <tt>mpiexec</tt> command that the pipeline sends via <tt>ssh</tt>, this should be set in your <tt>.cshrc</tt> file: <pre> module unload mpi/mpich-x86_64 module load mpi/openmpi-x86_64 </pre> The LOFAR software should also be sourced here, so that all the nodes can run the lofar commands, e.g.: <pre> alias lofar-newest 'module load lofar ; setenv PATH /soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:/soft/casacore-220216/bin:${PATH} ; source /soft/lofar-220216/lofarinit.csh ; setenv PATH /home/mjh/lofar/bin:$PATH ; setenv PYTHONPATH /soft/pyrap-220216/usr/lib64/python2.7/site-packages:/soft/lofar-220216/lib64/python2.7/site-packages:$PYTHONPATH ; setenv LD_LIBRARY_PATH=/soft/boost/lib:/soft/casacore-220216/lib:$LD_LIBRARY_PATH' alias lofar-tools 'setenv PYTHONPATH /home/wwilliams/software/lib/python2.7/site-packages/:/home/wwilliams/software/anaconda2/lib/python2.7/site-packages/:/home/wwilliams/python/face-cal-scripts/:$PYTHONPATH' lofar-newest lofar-tools </pre> To allow multiple versions of the pipeline on different nodes, the ssh host checking needs to be bypassed for localhost. This can be done in your <tt>.ssh/config</tt> file: <pre> Host localhost StrictHostKeyChecking no UserKnownHostsFile=/dev/null </pre> Passwordless SSH access between the nodes needs to be set up. = running the pipeline = The pipeline can be launched by submitting a job to the job control system, e.g.: <pre> qsub -l nodes=4:ppn=16 -W group_list=lofar ~/survey_pipeline/runfiles/qsub/run_pipe_L400115.qsub </pre> where <tt>nodes=4:ppn=16</tt> will request for the job to run on 4 entire (16 cpu) nodes. Alternatively it can be run on a part of a node (being careful to ensure that your parset matches the resources you are requesting): <pre> qsub -l nodes=1:ppn=8 -W group_list=lofar ~/survey_pipeline/runfiles/qsub/run_pipe_L400115.qsub </pre> An example <tt>qsub</tt> script is: <pre> #!/bin/bash #PBS -N pipeline-L424617 #PBS -l walltime=168:00:00 #PBS -k o echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: ARRAYID = $PBS_ARRAYID echo ------------------------------------------------------ modulecmd bash load lofar PATH=/soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:/soft/casacore-220216/bin:${PATH} . /soft/lofar-220216/lofarinit.sh PATH=/home/mjh/lofar/bin:$PATH PYTHONPATH=/soft/pyrap-220216/usr/lib64/python2.7/site-packages:/soft/lofar-220216/lib64/python2.7/site-packages:$PYTHONPATH LD_LIBRARY_PATH=/soft/boost/lib:/soft/casacore-220216/lib:$LD_LIBRARY_PATH genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> where <pre> genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> is the actual call to run the generic pipeline. The required arguments are a parset and configuration file. The calls to source the LOFAR software are required if this is not set in your <tt>.bashrc</tt>. You can make a more general submit script by passing the parset as an argument and using the <tt>-v</tt> flag on your qsub command. 
<pre> #!/bin/bash #PBS -k o echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: ARRAYID = $PBS_ARRAYID echo ------------------------------------------------------ echo echo running generic pipeline with parset $PARSET echo echo ------------------------------------------------------ modulecmd bash load lofar PATH=/soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:/soft/casacore-220216/bin:${PATH} . /soft/lofar-220216/lofarinit.sh PATH=/home/mjh/lofar/bin:$PATH PYTHONPATH=/soft/pyrap-220216/usr/lib64/python2.7/site-packages:$PYTHONPATH LD_LIBRARY_PATH=/soft/boost/lib:/soft/casacore-220216/lib:$LD_LIBRARY_PATH genericpipeline.py $PARSET -v -d -c /home/wwilliams/survey_pipeline/lpipeline-new.cfg </pre> and e.g. <pre> qsub -N L343220-pfc -l walltime=96:00:00 -l nodes=1:ppn=16 -W group_list=lofar -v PARSET=~/survey_pipeline/parset/survey/pfc_L343220.parset ~/survey_pipeline/runfiles/qsub/run_pipe.qsub </pre> The parset is the important part which defines all the steps to be take. The configuration file should be something like: <pre> [DEFAULT] lofarroot = /soft/lofar-220216 casaroot = /soft/casacore-1.7.0 pyraproot = /soft hdf5root = wcsroot = /opt/cep/wcslib pythonpath = /soft/lofar-220216/lib64/python2.7/site-packages runtime_directory = /car-data/wwilliams/pipeline-output/ recipe_directories = [%(pythonpath)s/lofarpipe/recipes,/home/wwilliams/scripts/git/prefactor] working_directory = /car-data/wwilliams/pipeline-products/ task_files = [%(lofarroot)s/share/pipeline/tasks.cfg] [layout] job_directory = %(runtime_directory)s/%(job_name)s [cluster] clusterdesc = %(lofarroot)s/share/cep2.clusterdesc [deploy] engine_ppath = %(pythonpath)s:%(pyraproot)s/lib:/opt/cep/pythonlibs/lib/python/site-packages engine_lpath = %(lofarroot)s/lib:%(casaroot)s/lib:%(pyraproot)s/lib:%(hdf5root)s/lib:%(wcsroot)s/lib [logging] log_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/pipeline.log xml_stat_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/statistics.xml [feedback] # Method of providing feedback to LOFAR. # Valid options: # messagebus Send feedback and status using LCS/MessageBus # none Do NOT send feedback and status method = none [remote] method = mpiexec max_per_node = 16 </pre> * Note <tt>[feedback] method = none</tt> is set because the messagebus does not currently work on the Herts cluster. * <tt>[remote] method = mpiexec</tt> allows for the use of multiple nodes. * The <tt>recipe_directories</tt> should be modified to include the Pre-Facet-Cal recipes (<tt>/home/wwilliams/scripts/git/prefactor</tt>). = Pre-Facet calibration = Set up the pre-facet calibration pipeline * download scripts from https://github.com/lofar-astron/prefactor. In addition to the changes mentioned in the genericpipeline -> quick-start page, you need to add the Pre-Facet-Cal directory to the "recipe_directories" in the pipeline.cfg (so that the plugins that come with the Pre-Facet-Cal are found). 
The current cookbook-section on the pre-facet pipeline can be found at https://github.com/lofar-astron/prefactor/blob/AH-development/docs/cookbook_prefacet.pdf * Edit the Pre-Facet-Cal.parset appropriately (see notes in the parset) and run the generic pipeline with this pre-facet-calibration parset. == Some known problems == * in argument.flags there should be no spaces! e.g. <pre>h5imp_cal.argument.flags = [h5_imp_map.output.mapfile, h5imp_cal_losoto.h5]</pre> bd2e207b25de99c6776062fc1e0ccfff8a101fe5 442 441 2016-03-03T14:38:58Z Wwilliams 13 /* Some known problems */ wikitext text/x-wiki = The generic pipeline = The generic pipeline comes with LOFAR builds 2.13+. At http://www.astron.nl/citt/genericpipeline/#quick-start there is a description of how to setup and run the generic pipeline. At Herts, the pipeline is included in the most recent lofar build (2.15 : <tt>$LOFARROOT=/soft/lofar-220216/</tt>). * A patch has been made to allow the pipeline to run across multiple whole nodes, in that you can submit a job which will run on N nodes. The [[jobs|job control system]] will allocate the nodes to you and the generic pipeline will deal with distributing the processes to the other nodes. This patch is in <tt>$LOFARROOT/lib64/python2.7/site-packages/lofarpipe/support/remotecommand.py</tt> The pipeline runs sends processes to other available nodes using <tt>mpiexec</tt>, but requires the <tt>openmpi</tt> version instead of what is loaded as standard on the cluster (see also [[MPI|MPI]]). For all the nodes to understand the <tt>mpiexec</tt> command that the pipeline sends via <tt>ssh</tt>, this should be set in your <tt>.cshrc</tt> file: <pre> module unload mpi/mpich-x86_64 module load mpi/openmpi-x86_64 </pre> The LOFAR software should also be sourced here, so that all the nodes can run the lofar commands, e.g.: <pre> alias lofar-newest 'module load lofar ; setenv PATH /soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:/soft/casacore-220216/bin:${PATH} ; source /soft/lofar-220216/lofarinit.csh ; setenv PATH /home/mjh/lofar/bin:$PATH ; setenv PYTHONPATH /soft/pyrap-220216/usr/lib64/python2.7/site-packages:/soft/lofar-220216/lib64/python2.7/site-packages:$PYTHONPATH ; setenv LD_LIBRARY_PATH=/soft/boost/lib:/soft/casacore-220216/lib:$LD_LIBRARY_PATH' alias lofar-tools 'setenv PYTHONPATH /home/wwilliams/software/lib/python2.7/site-packages/:/home/wwilliams/software/anaconda2/lib/python2.7/site-packages/:/home/wwilliams/python/face-cal-scripts/:$PYTHONPATH' lofar-newest lofar-tools </pre> To allow multiple versions of the pipeline on different nodes, the ssh host checking needs to be bypassed for localhost. This can be done in your <tt>.ssh/config</tt> file: <pre> Host localhost StrictHostKeyChecking no UserKnownHostsFile=/dev/null </pre> Passwordless SSH access between the nodes needs to be set up. = running the pipeline = The pipeline can be launched by submitting a job to the job control system, e.g.: <pre> qsub -l nodes=4:ppn=16 -W group_list=lofar ~/survey_pipeline/runfiles/qsub/run_pipe_L400115.qsub </pre> where <tt>nodes=4:ppn=16</tt> will request for the job to run on 4 entire (16 cpu) nodes. 
Alternatively it can be run on a part of a node (being careful to ensure that your parset matches the resources you are requesting): <pre> qsub -l nodes=1:ppn=8 -W group_list=lofar ~/survey_pipeline/runfiles/qsub/run_pipe_L400115.qsub </pre> An example <tt>qsub</tt> script is: <pre> #!/bin/bash #PBS -N pipeline-L424617 #PBS -l walltime=168:00:00 #PBS -k o echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: ARRAYID = $PBS_ARRAYID echo ------------------------------------------------------ modulecmd bash load lofar PATH=/soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:/soft/casacore-220216/bin:${PATH} . /soft/lofar-220216/lofarinit.sh PATH=/home/mjh/lofar/bin:$PATH PYTHONPATH=/soft/pyrap-220216/usr/lib64/python2.7/site-packages:/soft/lofar-220216/lib64/python2.7/site-packages:$PYTHONPATH LD_LIBRARY_PATH=/soft/boost/lib:/soft/casacore-220216/lib:$LD_LIBRARY_PATH genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> where <pre> genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> is the actual call to run the generic pipeline. The required arguments are a parset and configuration file. The calls to source the LOFAR software are required if this is not set in your <tt>.bashrc</tt>. You can make a more general submit script by passing the parset as an argument and using the <tt>-v</tt> flag on your qsub command. <pre> #!/bin/bash #PBS -k o echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: ARRAYID = $PBS_ARRAYID echo ------------------------------------------------------ echo echo running generic pipeline with parset $PARSET echo echo ------------------------------------------------------ modulecmd bash load lofar PATH=/soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:/soft/casacore-220216/bin:${PATH} . /soft/lofar-220216/lofarinit.sh PATH=/home/mjh/lofar/bin:$PATH PYTHONPATH=/soft/pyrap-220216/usr/lib64/python2.7/site-packages:$PYTHONPATH LD_LIBRARY_PATH=/soft/boost/lib:/soft/casacore-220216/lib:$LD_LIBRARY_PATH genericpipeline.py $PARSET -v -d -c /home/wwilliams/survey_pipeline/lpipeline-new.cfg </pre> and e.g. <pre> qsub -N L343220-pfc -l walltime=96:00:00 -l nodes=1:ppn=16 -W group_list=lofar -v PARSET=~/survey_pipeline/parset/survey/pfc_L343220.parset ~/survey_pipeline/runfiles/qsub/run_pipe.qsub </pre> The parset is the important part which defines all the steps to be take. 
The configuration file should be something like: <pre> [DEFAULT] lofarroot = /soft/lofar-220216 casaroot = /soft/casacore-1.7.0 pyraproot = /soft hdf5root = wcsroot = /opt/cep/wcslib pythonpath = /soft/lofar-220216/lib64/python2.7/site-packages runtime_directory = /car-data/wwilliams/pipeline-output/ recipe_directories = [%(pythonpath)s/lofarpipe/recipes,/home/wwilliams/scripts/git/prefactor] working_directory = /car-data/wwilliams/pipeline-products/ task_files = [%(lofarroot)s/share/pipeline/tasks.cfg] [layout] job_directory = %(runtime_directory)s/%(job_name)s [cluster] clusterdesc = %(lofarroot)s/share/cep2.clusterdesc [deploy] engine_ppath = %(pythonpath)s:%(pyraproot)s/lib:/opt/cep/pythonlibs/lib/python/site-packages engine_lpath = %(lofarroot)s/lib:%(casaroot)s/lib:%(pyraproot)s/lib:%(hdf5root)s/lib:%(wcsroot)s/lib [logging] log_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/pipeline.log xml_stat_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/statistics.xml [feedback] # Method of providing feedback to LOFAR. # Valid options: # messagebus Send feedback and status using LCS/MessageBus # none Do NOT send feedback and status method = none [remote] method = mpiexec max_per_node = 16 </pre> * Note <tt>[feedback] method = none</tt> is set because the messagebus does not currently work on the Herts cluster. * <tt>[remote] method = mpiexec</tt> allows for the use of multiple nodes. * The <tt>recipe_directories</tt> should be modified to include the Pre-Facet-Cal recipes (<tt>/home/wwilliams/scripts/git/prefactor</tt>). = Pre-Facet calibration = Set up the pre-facet calibration pipeline * download scripts from https://github.com/lofar-astron/prefactor. In addition to the changes mentioned in the genericpipeline -> quick-start page, you need to add the Pre-Facet-Cal directory to the "recipe_directories" in the pipeline.cfg (so that the plugins that come with the Pre-Facet-Cal are found). The current cookbook-section on the pre-facet pipeline can be found at https://github.com/lofar-astron/prefactor/blob/AH-development/docs/cookbook_prefacet.pdf * Edit the Pre-Facet-Cal.parset appropriately (see notes in the parset) and run the generic pipeline with this pre-facet-calibration parset. == Some known problems == * in argument.flags there should be no spaces! e.g. <pre>h5imp_cal.argument.flags = [h5_imp_map.output.mapfile, h5imp_cal_losoto.h5]</pre> * check you're sourcing the right lofar software version c0ab6506403c9fff0bdb3c24d6af5ed006e71485 443 442 2016-03-03T14:41:28Z Wwilliams 13 /* The generic pipeline */ wikitext text/x-wiki = The generic pipeline = The generic pipeline comes with LOFAR builds 2.13+. At http://www.astron.nl/citt/genericpipeline/#quick-start there is a description of how to setup and run the generic pipeline. At Herts, the pipeline is included in the most recent lofar build (2.15 : <tt>$LOFARROOT=/soft/lofar-220216/</tt>). * A patch has been made to allow the pipeline to run across multiple whole nodes, in that you can submit a job which will run on N nodes. The [[jobs|job control system]] will allocate the nodes to you and the generic pipeline will deal with distributing the processes to the other nodes. This patch is in <tt>$LOFARROOT/lib64/python2.7/site-packages/lofarpipe/support/remotecommand.py</tt> The pipeline runs sends processes to other available nodes using <tt>mpiexec</tt>, but requires the <tt>openmpi</tt> version instead of what is loaded as standard on the cluster (see also [[MPI|MPI]]). 
For all the nodes to understand the <tt>mpiexec</tt> command that the pipeline sends via <tt>ssh</tt>, this should be set in your <tt>.cshrc</tt> file: <pre> module unload mpi/mpich-x86_64 module load mpi/openmpi-x86_64 </pre> The LOFAR software should also be sourced here, so that all the nodes can run the lofar commands, e.g.: <pre> alias lofar-newest 'module load lofar ; setenv PATH /soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:/soft/casacore-220216/bin:${PATH} ; source /soft/lofar-220216/lofarinit.csh ; setenv PATH /home/mjh/lofar/bin:$PATH ; setenv PYTHONPATH /soft/pyrap-220216/usr/lib64/python2.7/site-packages:/soft/lofar-220216/lib64/python2.7/site-packages:$PYTHONPATH ; setenv LD_LIBRARY_PATH=/soft/boost/lib:/soft/casacore-220216/lib:$LD_LIBRARY_PATH' alias lofar-tools 'setenv PYTHONPATH /home/wwilliams/software/lib/python2.7/site-packages/:/home/wwilliams/software/anaconda2/lib/python2.7/site-packages/:/home/wwilliams/python/face-cal-scripts/:$PYTHONPATH' lofar-newest lofar-tools </pre> To allow multiple versions of the pipeline on different nodes, the ssh host checking needs to be bypassed for localhost. This can be done in your <tt>.ssh/config</tt> file: <pre> Host localhost StrictHostKeyChecking no UserKnownHostsFile=/dev/null </pre> [[Passwordless_ssh]] access between the nodes needs to be set up. = running the pipeline = The pipeline can be launched by submitting a job to the job control system, e.g.: <pre> qsub -l nodes=4:ppn=16 -W group_list=lofar ~/survey_pipeline/runfiles/qsub/run_pipe_L400115.qsub </pre> where <tt>nodes=4:ppn=16</tt> will request for the job to run on 4 entire (16 cpu) nodes. Alternatively it can be run on a part of a node (being careful to ensure that your parset matches the resources you are requesting): <pre> qsub -l nodes=1:ppn=8 -W group_list=lofar ~/survey_pipeline/runfiles/qsub/run_pipe_L400115.qsub </pre> An example <tt>qsub</tt> script is: <pre> #!/bin/bash #PBS -N pipeline-L424617 #PBS -l walltime=168:00:00 #PBS -k o echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: ARRAYID = $PBS_ARRAYID echo ------------------------------------------------------ modulecmd bash load lofar PATH=/soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:/soft/casacore-220216/bin:${PATH} . /soft/lofar-220216/lofarinit.sh PATH=/home/mjh/lofar/bin:$PATH PYTHONPATH=/soft/pyrap-220216/usr/lib64/python2.7/site-packages:/soft/lofar-220216/lib64/python2.7/site-packages:$PYTHONPATH LD_LIBRARY_PATH=/soft/boost/lib:/soft/casacore-220216/lib:$LD_LIBRARY_PATH genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> where <pre> genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> is the actual call to run the generic pipeline. The required arguments are a parset and configuration file. The calls to source the LOFAR software are required if this is not set in your <tt>.bashrc</tt>. You can make a more general submit script by passing the parset as an argument and using the <tt>-v</tt> flag on your qsub command. 
<pre> #!/bin/bash #PBS -k o echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: ARRAYID = $PBS_ARRAYID echo ------------------------------------------------------ echo echo running generic pipeline with parset $PARSET echo echo ------------------------------------------------------ modulecmd bash load lofar PATH=/soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:/soft/casacore-220216/bin:${PATH} . /soft/lofar-220216/lofarinit.sh PATH=/home/mjh/lofar/bin:$PATH PYTHONPATH=/soft/pyrap-220216/usr/lib64/python2.7/site-packages:$PYTHONPATH LD_LIBRARY_PATH=/soft/boost/lib:/soft/casacore-220216/lib:$LD_LIBRARY_PATH genericpipeline.py $PARSET -v -d -c /home/wwilliams/survey_pipeline/lpipeline-new.cfg </pre> and e.g. <pre> qsub -N L343220-pfc -l walltime=96:00:00 -l nodes=1:ppn=16 -W group_list=lofar -v PARSET=~/survey_pipeline/parset/survey/pfc_L343220.parset ~/survey_pipeline/runfiles/qsub/run_pipe.qsub </pre> The parset is the important part which defines all the steps to be take. The configuration file should be something like: <pre> [DEFAULT] lofarroot = /soft/lofar-220216 casaroot = /soft/casacore-1.7.0 pyraproot = /soft hdf5root = wcsroot = /opt/cep/wcslib pythonpath = /soft/lofar-220216/lib64/python2.7/site-packages runtime_directory = /car-data/wwilliams/pipeline-output/ recipe_directories = [%(pythonpath)s/lofarpipe/recipes,/home/wwilliams/scripts/git/prefactor] working_directory = /car-data/wwilliams/pipeline-products/ task_files = [%(lofarroot)s/share/pipeline/tasks.cfg] [layout] job_directory = %(runtime_directory)s/%(job_name)s [cluster] clusterdesc = %(lofarroot)s/share/cep2.clusterdesc [deploy] engine_ppath = %(pythonpath)s:%(pyraproot)s/lib:/opt/cep/pythonlibs/lib/python/site-packages engine_lpath = %(lofarroot)s/lib:%(casaroot)s/lib:%(pyraproot)s/lib:%(hdf5root)s/lib:%(wcsroot)s/lib [logging] log_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/pipeline.log xml_stat_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/statistics.xml [feedback] # Method of providing feedback to LOFAR. # Valid options: # messagebus Send feedback and status using LCS/MessageBus # none Do NOT send feedback and status method = none [remote] method = mpiexec max_per_node = 16 </pre> * Note <tt>[feedback] method = none</tt> is set because the messagebus does not currently work on the Herts cluster. * <tt>[remote] method = mpiexec</tt> allows for the use of multiple nodes. * The <tt>recipe_directories</tt> should be modified to include the Pre-Facet-Cal recipes (<tt>/home/wwilliams/scripts/git/prefactor</tt>). = Pre-Facet calibration = Set up the pre-facet calibration pipeline * download scripts from https://github.com/lofar-astron/prefactor. In addition to the changes mentioned in the genericpipeline -> quick-start page, you need to add the Pre-Facet-Cal directory to the "recipe_directories" in the pipeline.cfg (so that the plugins that come with the Pre-Facet-Cal are found). 
The current cookbook-section on the pre-facet pipeline can be found at https://github.com/lofar-astron/prefactor/blob/AH-development/docs/cookbook_prefacet.pdf * Edit the Pre-Facet-Cal.parset appropriately (see notes in the parset) and run the generic pipeline with this pre-facet-calibration parset. == Some known problems == * in argument.flags there should be no spaces! e.g. <pre>h5imp_cal.argument.flags = [h5_imp_map.output.mapfile, h5imp_cal_losoto.h5]</pre> * check you're sourcing the right lofar software version 1ca99409ec408badf1d692261fed91fd66a3904d 444 443 2016-03-15T11:38:36Z Wwilliams 13 /* The generic pipeline */ wikitext text/x-wiki = The generic pipeline = The generic pipeline comes with LOFAR builds 2.13+. At http://www.astron.nl/citt/genericpipeline/#quick-start there is a description of how to setup and run the generic pipeline. At Herts, the pipeline is included in the most recent lofar build (2.15 : <tt>$LOFARROOT=/soft/lofar-220216/</tt>). * A patch has been made to allow the pipeline to run across multiple whole nodes, in that you can submit a job which will run on N nodes. The [[jobs|job control system]] will allocate the nodes to you and the generic pipeline will deal with distributing the processes to the other nodes. This patch is in <tt>$LOFARROOT/lib64/python2.7/site-packages/lofarpipe/support/remotecommand.py</tt> The pipeline runs sends processes to other available nodes using <tt>mpiexec</tt>, but requires the <tt>openmpi</tt> version instead of what is loaded as standard on the cluster (see also [[MPI|MPI]]). For all the nodes to understand the <tt>mpiexec</tt> command that the pipeline sends via <tt>ssh</tt>, this should be set in your <tt>.cshrc</tt> file: <pre> module unload mpi/mpich-x86_64 module load mpi/openmpi-x86_64 </pre> The LOFAR software should also be sourced here, so that all the nodes can run the lofar commands, e.g.: <pre> alias lofar-newest 'module load lofar ; setenv PATH /soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:/soft/casacore-220216/bin:${PATH} ; source /soft/lofar-220216/lofarinit.csh ; setenv PATH /home/mjh/lofar/bin:$PATH ; setenv PYTHONPATH /soft/pyrap-220216/usr/lib64/python2.7/site-packages:/soft/lofar-220216/lib64/python2.7/site-packages:$PYTHONPATH ; setenv LD_LIBRARY_PATH /soft/boost/lib:/soft/casacore-220216/lib:$LD_LIBRARY_PATH' alias lofar-tools 'setenv PYTHONPATH /home/wwilliams/software/lib/python2.7/site-packages/:/home/wwilliams/software/anaconda2/lib/python2.7/site-packages/:/home/wwilliams/python/face-cal-scripts/:$PYTHONPATH' lofar-newest lofar-tools </pre> To allow multiple versions of the pipeline on different nodes, the ssh host checking needs to be bypassed for localhost. This can be done in your <tt>.ssh/config</tt> file: <pre> Host localhost StrictHostKeyChecking no UserKnownHostsFile=/dev/null </pre> [[Passwordless_ssh]] access between the nodes needs to be set up. = running the pipeline = The pipeline can be launched by submitting a job to the job control system, e.g.: <pre> qsub -l nodes=4:ppn=16 -W group_list=lofar ~/survey_pipeline/runfiles/qsub/run_pipe_L400115.qsub </pre> where <tt>nodes=4:ppn=16</tt> will request for the job to run on 4 entire (16 cpu) nodes. 
Alternatively it can be run on a part of a node (being careful to ensure that your parset matches the resources you are requesting): <pre> qsub -l nodes=1:ppn=8 -W group_list=lofar ~/survey_pipeline/runfiles/qsub/run_pipe_L400115.qsub </pre> An example <tt>qsub</tt> script is: <pre> #!/bin/bash #PBS -N pipeline-L424617 #PBS -l walltime=168:00:00 #PBS -k o echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: ARRAYID = $PBS_ARRAYID echo ------------------------------------------------------ modulecmd bash load lofar PATH=/soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:/soft/casacore-220216/bin:${PATH} . /soft/lofar-220216/lofarinit.sh PATH=/home/mjh/lofar/bin:$PATH PYTHONPATH=/soft/pyrap-220216/usr/lib64/python2.7/site-packages:/soft/lofar-220216/lib64/python2.7/site-packages:$PYTHONPATH LD_LIBRARY_PATH=/soft/boost/lib:/soft/casacore-220216/lib:$LD_LIBRARY_PATH genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> where <pre> genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> is the actual call to run the generic pipeline. The required arguments are a parset and configuration file. The calls to source the LOFAR software are required if this is not set in your <tt>.bashrc</tt>. You can make a more general submit script by passing the parset as an argument and using the <tt>-v</tt> flag on your qsub command. <pre> #!/bin/bash #PBS -k o echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: ARRAYID = $PBS_ARRAYID echo ------------------------------------------------------ echo echo running generic pipeline with parset $PARSET echo echo ------------------------------------------------------ modulecmd bash load lofar PATH=/soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:/soft/casacore-220216/bin:${PATH} . /soft/lofar-220216/lofarinit.sh PATH=/home/mjh/lofar/bin:$PATH PYTHONPATH=/soft/pyrap-220216/usr/lib64/python2.7/site-packages:$PYTHONPATH LD_LIBRARY_PATH=/soft/boost/lib:/soft/casacore-220216/lib:$LD_LIBRARY_PATH genericpipeline.py $PARSET -v -d -c /home/wwilliams/survey_pipeline/lpipeline-new.cfg </pre> and e.g. <pre> qsub -N L343220-pfc -l walltime=96:00:00 -l nodes=1:ppn=16 -W group_list=lofar -v PARSET=~/survey_pipeline/parset/survey/pfc_L343220.parset ~/survey_pipeline/runfiles/qsub/run_pipe.qsub </pre> The parset is the important part which defines all the steps to be take. 
The configuration file should be something like: <pre> [DEFAULT] lofarroot = /soft/lofar-220216 casaroot = /soft/casacore-1.7.0 pyraproot = /soft hdf5root = wcsroot = /opt/cep/wcslib pythonpath = /soft/lofar-220216/lib64/python2.7/site-packages runtime_directory = /car-data/wwilliams/pipeline-output/ recipe_directories = [%(pythonpath)s/lofarpipe/recipes,/home/wwilliams/scripts/git/prefactor] working_directory = /car-data/wwilliams/pipeline-products/ task_files = [%(lofarroot)s/share/pipeline/tasks.cfg] [layout] job_directory = %(runtime_directory)s/%(job_name)s [cluster] clusterdesc = %(lofarroot)s/share/cep2.clusterdesc [deploy] engine_ppath = %(pythonpath)s:%(pyraproot)s/lib:/opt/cep/pythonlibs/lib/python/site-packages engine_lpath = %(lofarroot)s/lib:%(casaroot)s/lib:%(pyraproot)s/lib:%(hdf5root)s/lib:%(wcsroot)s/lib [logging] log_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/pipeline.log xml_stat_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/statistics.xml [feedback] # Method of providing feedback to LOFAR. # Valid options: # messagebus Send feedback and status using LCS/MessageBus # none Do NOT send feedback and status method = none [remote] method = mpiexec max_per_node = 16 </pre> * Note <tt>[feedback] method = none</tt> is set because the messagebus does not currently work on the Herts cluster. * <tt>[remote] method = mpiexec</tt> allows for the use of multiple nodes. * The <tt>recipe_directories</tt> should be modified to include the Pre-Facet-Cal recipes (<tt>/home/wwilliams/scripts/git/prefactor</tt>). = Pre-Facet calibration = Set up the pre-facet calibration pipeline * download scripts from https://github.com/lofar-astron/prefactor. In addition to the changes mentioned in the genericpipeline -> quick-start page, you need to add the Pre-Facet-Cal directory to the "recipe_directories" in the pipeline.cfg (so that the plugins that come with the Pre-Facet-Cal are found). The current cookbook-section on the pre-facet pipeline can be found at https://github.com/lofar-astron/prefactor/blob/AH-development/docs/cookbook_prefacet.pdf * Edit the Pre-Facet-Cal.parset appropriately (see notes in the parset) and run the generic pipeline with this pre-facet-calibration parset. == Some known problems == * in argument.flags there should be no spaces! e.g. <pre>h5imp_cal.argument.flags = [h5_imp_map.output.mapfile, h5imp_cal_losoto.h5]</pre> * check you're sourcing the right lofar software version 39ad8a92eec8eaf58d1be5584a5529d1153efc8e 455 444 2016-11-15T15:08:27Z Wwilliams 13 /* The generic pipeline */ wikitext text/x-wiki = The generic pipeline = The generic pipeline comes with LOFAR builds 2.13+. At http://www.astron.nl/citt/genericpipeline/#quick-start there is a description of how to setup and run the generic pipeline. At Herts, the pipeline is included in the most recent lofar build (2.15 : <tt>$LOFARROOT=/soft/lofar-051116/</tt>). The pipeline can run across multiple whole nodes, in that you can submit a job which will run on N nodes. The [[jobs|job control system]] will allocate the nodes to you and the generic pipeline will deal with distributing the processes to the other nodes. The pipeline runs sends processes to other available nodes using <tt>mpiexec</tt>, but requires the <tt>openmpi</tt> version instead of what is loaded as standard on the cluster (see also [[MPI|MPI]]). 
For all the nodes to understand the <tt>mpiexec</tt> command that the pipeline sends via <tt>ssh</tt>, this should be set in your <tt>.cshrc</tt> file: <pre> module unload mpi/mpich-x86_64 module load mpi/openmpi-x86_64 </pre> The LOFAR software should also be sourced here, so that all the nodes can run the lofar commands, e.g.: <pre> alias lofar-newest 'module load lofar ; setenv PATH /soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:/soft/casacore-220216/bin:${PATH} ; source /soft/lofar-220216/lofarinit.csh ; setenv PATH /home/mjh/lofar/bin:$PATH ; setenv PYTHONPATH /soft/pyrap-220216/usr/lib64/python2.7/site-packages:/soft/lofar-220216/lib64/python2.7/site-packages:$PYTHONPATH ; setenv LD_LIBRARY_PATH /soft/boost/lib:/soft/casacore-220216/lib:$LD_LIBRARY_PATH' alias lofar-tools 'setenv PYTHONPATH /home/wwilliams/software/lib/python2.7/site-packages/:/home/wwilliams/software/anaconda2/lib/python2.7/site-packages/:/home/wwilliams/python/face-cal-scripts/:$PYTHONPATH' lofar-newest lofar-tools </pre> To allow multiple versions of the pipeline on different nodes, the ssh host checking needs to be bypassed for localhost. This can be done in your <tt>.ssh/config</tt> file: <pre> Host localhost StrictHostKeyChecking no UserKnownHostsFile=/dev/null </pre> [[Passwordless_ssh]] access between the nodes needs to be set up. = running the pipeline = The pipeline can be launched by submitting a job to the job control system, e.g.: <pre> qsub -l nodes=4:ppn=16 -W group_list=lofar ~/survey_pipeline/runfiles/qsub/run_pipe_L400115.qsub </pre> where <tt>nodes=4:ppn=16</tt> will request for the job to run on 4 entire (16 cpu) nodes. Alternatively it can be run on a part of a node (being careful to ensure that your parset matches the resources you are requesting): <pre> qsub -l nodes=1:ppn=8 -W group_list=lofar ~/survey_pipeline/runfiles/qsub/run_pipe_L400115.qsub </pre> An example <tt>qsub</tt> script is: <pre> #!/bin/bash #PBS -N pipeline-L424617 #PBS -l walltime=168:00:00 #PBS -k o echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: ARRAYID = $PBS_ARRAYID echo ------------------------------------------------------ modulecmd bash load lofar PATH=/soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:/soft/casacore-220216/bin:${PATH} . /soft/lofar-220216/lofarinit.sh PATH=/home/mjh/lofar/bin:$PATH PYTHONPATH=/soft/pyrap-220216/usr/lib64/python2.7/site-packages:/soft/lofar-220216/lib64/python2.7/site-packages:$PYTHONPATH LD_LIBRARY_PATH=/soft/boost/lib:/soft/casacore-220216/lib:$LD_LIBRARY_PATH genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> where <pre> genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> is the actual call to run the generic pipeline. The required arguments are a parset and configuration file. The calls to source the LOFAR software are required if this is not set in your <tt>.bashrc</tt>. You can make a more general submit script by passing the parset as an argument and using the <tt>-v</tt> flag on your qsub command. 
<pre> #!/bin/bash #PBS -k o echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: ARRAYID = $PBS_ARRAYID echo ------------------------------------------------------ echo echo running generic pipeline with parset $PARSET echo echo ------------------------------------------------------ modulecmd bash load lofar PATH=/soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:/soft/casacore-220216/bin:${PATH} . /soft/lofar-220216/lofarinit.sh PATH=/home/mjh/lofar/bin:$PATH PYTHONPATH=/soft/pyrap-220216/usr/lib64/python2.7/site-packages:$PYTHONPATH LD_LIBRARY_PATH=/soft/boost/lib:/soft/casacore-220216/lib:$LD_LIBRARY_PATH genericpipeline.py $PARSET -v -d -c /home/wwilliams/survey_pipeline/lpipeline-new.cfg </pre> and e.g. <pre> qsub -N L343220-pfc -l walltime=96:00:00 -l nodes=1:ppn=16 -W group_list=lofar -v PARSET=~/survey_pipeline/parset/survey/pfc_L343220.parset ~/survey_pipeline/runfiles/qsub/run_pipe.qsub </pre> The parset is the important part which defines all the steps to be take. The configuration file should be something like: <pre> [DEFAULT] lofarroot = /soft/lofar-220216 casaroot = /soft/casacore-1.7.0 pyraproot = /soft hdf5root = wcsroot = /opt/cep/wcslib pythonpath = /soft/lofar-220216/lib64/python2.7/site-packages runtime_directory = /car-data/wwilliams/pipeline-output/ recipe_directories = [%(pythonpath)s/lofarpipe/recipes,/home/wwilliams/scripts/git/prefactor] working_directory = /car-data/wwilliams/pipeline-products/ task_files = [%(lofarroot)s/share/pipeline/tasks.cfg] [layout] job_directory = %(runtime_directory)s/%(job_name)s [cluster] clusterdesc = %(lofarroot)s/share/cep2.clusterdesc [deploy] engine_ppath = %(pythonpath)s:%(pyraproot)s/lib:/opt/cep/pythonlibs/lib/python/site-packages engine_lpath = %(lofarroot)s/lib:%(casaroot)s/lib:%(pyraproot)s/lib:%(hdf5root)s/lib:%(wcsroot)s/lib [logging] log_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/pipeline.log xml_stat_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/statistics.xml [feedback] # Method of providing feedback to LOFAR. # Valid options: # messagebus Send feedback and status using LCS/MessageBus # none Do NOT send feedback and status method = none [remote] method = mpiexec max_per_node = 16 </pre> * Note <tt>[feedback] method = none</tt> is set because the messagebus does not currently work on the Herts cluster. * <tt>[remote] method = mpiexec</tt> allows for the use of multiple nodes. * The <tt>recipe_directories</tt> should be modified to include the Pre-Facet-Cal recipes (<tt>/home/wwilliams/scripts/git/prefactor</tt>). = Pre-Facet calibration = Set up the pre-facet calibration pipeline * download scripts from https://github.com/lofar-astron/prefactor. In addition to the changes mentioned in the genericpipeline -> quick-start page, you need to add the Pre-Facet-Cal directory to the "recipe_directories" in the pipeline.cfg (so that the plugins that come with the Pre-Facet-Cal are found). 
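A minimal sketch of fetching the prefactor recipes with git (the destination directory here is just an example; put the checkout wherever you normally keep code and use that path in <tt>recipe_directories</tt>):
<pre>
git clone https://github.com/lofar-astron/prefactor.git ~/scripts/git/prefactor
</pre>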
The current cookbook-section on the pre-facet pipeline can be found at https://github.com/lofar-astron/prefactor/blob/AH-development/docs/cookbook_prefacet.pdf * Edit the Pre-Facet-Cal.parset appropriately (see notes in the parset) and run the generic pipeline with this pre-facet-calibration parset. == Some known problems == * in argument.flags there should be no spaces! e.g. <pre>h5imp_cal.argument.flags = [h5_imp_map.output.mapfile, h5imp_cal_losoto.h5]</pre> * check you're sourcing the right lofar software version c8f846d41a9789b58fc9755085182583c9bb8686 456 455 2016-11-15T16:24:19Z Wwilliams 13 /* The generic pipeline */ wikitext text/x-wiki = The generic pipeline = The generic pipeline comes with LOFAR builds 2.13+. At http://www.astron.nl/citt/genericpipeline/#quick-start there is a description of how to setup and run the generic pipeline. At Herts, the pipeline is included in the most recent lofar build (2.15 : <tt>$LOFARROOT=/soft/lofar-051116/</tt>). The pipeline can run across multiple whole nodes, in that you can submit a job which will run on N nodes. The [[jobs|job control system]] will allocate the nodes to you and the generic pipeline will deal with distributing the processes to the other nodes. The pipeline sends processes to other available nodes using <tt>ssh</tt>. The LOFAR software should also be sourced in your bash or csh rc file, so that all the nodes can run the lofar commands, e.g.: <pre> alias lofar-newest 'setenv PATH /soft/casa-release-4.5.0-el6:/soft/casacore-220216/bin:${PATH} ; source /soft/lofar-051116/lofarinit.csh ; setenv PYTHONPATH /soft/pyrap-220216/usr/lib64/python2.7/site-packages:/soft/lofar-220216/lib64/python2.7/site-packages:$PYTHONPATH ; setenv LD_LIBRARY_PATH /soft/boost/lib:/soft/casacore-220216/lib:$LD_LIBRARY_PATH' alias lofar-tools 'setenv PYTHONPATH /home/wwilliams/software/lib/python2.7/site-packages/:/home/wwilliams/software/anaconda2/lib/python2.7/site-packages/:$PYTHONPATH' lofar-newest lofar-tools </pre> To allow multiple versions of the pipeline on different nodes, the ssh host checking needs to be bypassed for localhost. This can be done in your <tt>.ssh/config</tt> file: <pre> Host localhost StrictHostKeyChecking no UserKnownHostsFile=/dev/null </pre> [[Passwordless_ssh]] access between the nodes needs to be set up. = running the pipeline = The pipeline can be launched by submitting a job to the job control system, e.g.: <pre> qsub -l nodes=4:ppn=16 -W group_list=lofar ~/survey_pipeline/runfiles/qsub/run_pipe_L400115.qsub </pre> where <tt>nodes=4:ppn=16</tt> will request for the job to run on 4 entire (16 cpu) nodes. 
Alternatively it can be run on a part of a node (being careful to ensure that your parset matches the resources you are requesting): <pre> qsub -l nodes=1:ppn=8 -W group_list=lofar ~/survey_pipeline/runfiles/qsub/run_pipe_L400115.qsub </pre> An example <tt>qsub</tt> script is: <pre> #!/bin/bash #PBS -N pipeline-L424617 #PBS -l walltime=168:00:00 #PBS -k o echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: ARRAYID = $PBS_ARRAYID echo ------------------------------------------------------ modulecmd bash load lofar PATH=/soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:/soft/casacore-220216/bin:${PATH} . /soft/lofar-220216/lofarinit.sh PATH=/home/mjh/lofar/bin:$PATH PYTHONPATH=/soft/pyrap-220216/usr/lib64/python2.7/site-packages:/soft/lofar-220216/lib64/python2.7/site-packages:$PYTHONPATH LD_LIBRARY_PATH=/soft/boost/lib:/soft/casacore-220216/lib:$LD_LIBRARY_PATH genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> where <pre> genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> is the actual call to run the generic pipeline. The required arguments are a parset and configuration file. The calls to source the LOFAR software are required if this is not set in your <tt>.bashrc</tt>. You can make a more general submit script by passing the parset as an argument and using the <tt>-v</tt> flag on your qsub command. <pre> #!/bin/bash #PBS -k o echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: ARRAYID = $PBS_ARRAYID echo ------------------------------------------------------ echo echo running generic pipeline with parset $PARSET echo echo ------------------------------------------------------ modulecmd bash load lofar PATH=/soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:/soft/casacore-220216/bin:${PATH} . /soft/lofar-220216/lofarinit.sh PATH=/home/mjh/lofar/bin:$PATH PYTHONPATH=/soft/pyrap-220216/usr/lib64/python2.7/site-packages:$PYTHONPATH LD_LIBRARY_PATH=/soft/boost/lib:/soft/casacore-220216/lib:$LD_LIBRARY_PATH genericpipeline.py $PARSET -v -d -c /home/wwilliams/survey_pipeline/lpipeline-new.cfg </pre> and e.g. <pre> qsub -N L343220-pfc -l walltime=96:00:00 -l nodes=1:ppn=16 -W group_list=lofar -v PARSET=~/survey_pipeline/parset/survey/pfc_L343220.parset ~/survey_pipeline/runfiles/qsub/run_pipe.qsub </pre> The parset is the important part which defines all the steps to be take. 
The configuration file should be something like: <pre> [DEFAULT] lofarroot = /soft/lofar-220216 casaroot = /soft/casacore-1.7.0 pyraproot = /soft hdf5root = wcsroot = /opt/cep/wcslib pythonpath = /soft/lofar-220216/lib64/python2.7/site-packages runtime_directory = /car-data/wwilliams/pipeline-output/ recipe_directories = [%(pythonpath)s/lofarpipe/recipes,/home/wwilliams/scripts/git/prefactor] working_directory = /car-data/wwilliams/pipeline-products/ task_files = [%(lofarroot)s/share/pipeline/tasks.cfg] [layout] job_directory = %(runtime_directory)s/%(job_name)s [cluster] clusterdesc = %(lofarroot)s/share/cep2.clusterdesc [deploy] engine_ppath = %(pythonpath)s:%(pyraproot)s/lib:/opt/cep/pythonlibs/lib/python/site-packages engine_lpath = %(lofarroot)s/lib:%(casaroot)s/lib:%(pyraproot)s/lib:%(hdf5root)s/lib:%(wcsroot)s/lib [logging] log_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/pipeline.log xml_stat_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/statistics.xml [feedback] # Method of providing feedback to LOFAR. # Valid options: # messagebus Send feedback and status using LCS/MessageBus # none Do NOT send feedback and status method = none [remote] method = mpiexec max_per_node = 16 </pre> * Note <tt>[feedback] method = none</tt> is set because the messagebus does not currently work on the Herts cluster. * <tt>[remote] method = mpiexec</tt> allows for the use of multiple nodes. * The <tt>recipe_directories</tt> should be modified to include the Pre-Facet-Cal recipes (<tt>/home/wwilliams/scripts/git/prefactor</tt>). = Pre-Facet calibration = Set up the pre-facet calibration pipeline * download scripts from https://github.com/lofar-astron/prefactor. In addition to the changes mentioned in the genericpipeline -> quick-start page, you need to add the Pre-Facet-Cal directory to the "recipe_directories" in the pipeline.cfg (so that the plugins that come with the Pre-Facet-Cal are found). The current cookbook-section on the pre-facet pipeline can be found at https://github.com/lofar-astron/prefactor/blob/AH-development/docs/cookbook_prefacet.pdf * Edit the Pre-Facet-Cal.parset appropriately (see notes in the parset) and run the generic pipeline with this pre-facet-calibration parset. == Some known problems == * in argument.flags there should be no spaces! e.g. <pre>h5imp_cal.argument.flags = [h5_imp_map.output.mapfile, h5imp_cal_losoto.h5]</pre> * check you're sourcing the right lofar software version b58b9bb3404b2de7b3c0d159c9147d6e6e1fc8a1 457 456 2016-11-15T16:25:20Z Wwilliams 13 /* running the pipeline */ wikitext text/x-wiki = The generic pipeline = The generic pipeline comes with LOFAR builds 2.13+. At http://www.astron.nl/citt/genericpipeline/#quick-start there is a description of how to setup and run the generic pipeline. At Herts, the pipeline is included in the most recent lofar build (2.15 : <tt>$LOFARROOT=/soft/lofar-051116/</tt>). The pipeline can run across multiple whole nodes, in that you can submit a job which will run on N nodes. The [[jobs|job control system]] will allocate the nodes to you and the generic pipeline will deal with distributing the processes to the other nodes. The pipeline sends processes to other available nodes using <tt>ssh</tt>. 
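A quick check that non-interactive ssh between nodes works the way the pipeline expects (a sketch; <tt>node001</tt> is just an example node name, and this assumes [[Passwordless_ssh|passwordless ssh]] is already set up as described below):
<pre>
ssh node001 hostname    # should print the node name without prompting for a password
</pre>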
The LOFAR software should also be sourced in your bash or csh rc file, so that all the nodes can run the lofar commands, e.g.: <pre> alias lofar-newest 'setenv PATH /soft/casa-release-4.5.0-el6:/soft/casacore-220216/bin:${PATH} ; source /soft/lofar-051116/lofarinit.csh ; setenv PYTHONPATH /soft/pyrap-220216/usr/lib64/python2.7/site-packages:/soft/lofar-220216/lib64/python2.7/site-packages:$PYTHONPATH ; setenv LD_LIBRARY_PATH /soft/boost/lib:/soft/casacore-220216/lib:$LD_LIBRARY_PATH' alias lofar-tools 'setenv PYTHONPATH /home/wwilliams/software/lib/python2.7/site-packages/:/home/wwilliams/software/anaconda2/lib/python2.7/site-packages/:$PYTHONPATH' lofar-newest lofar-tools </pre> To allow multiple versions of the pipeline on different nodes, the ssh host checking needs to be bypassed for localhost. This can be done in your <tt>.ssh/config</tt> file: <pre> Host localhost StrictHostKeyChecking no UserKnownHostsFile=/dev/null </pre> [[Passwordless_ssh]] access between the nodes needs to be set up. = running the pipeline = The pipeline can be launched by submitting a job to the job control system, e.g.: <pre> qsub -l nodes=4:ppn=16 -W group_list=lofar ~/survey_pipeline/runfiles/qsub/run_pipe_L400115.qsub </pre> where <tt>nodes=4:ppn=16</tt> will request for the job to run on 4 entire (16 cpu) nodes. Alternatively it can be run on a part of a node (being careful to ensure that your parset matches the resources you are requesting): <pre> qsub -l nodes=1:ppn=8 -W group_list=lofar ~/survey_pipeline/runfiles/qsub/run_pipe_L400115.qsub </pre> An example <tt>qsub</tt> script is: <pre> #!/bin/bash #PBS -N pipeline-L424617 #PBS -l walltime=168:00:00 #PBS -k o echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: ARRAYID = $PBS_ARRAYID echo ------------------------------------------------------ modulecmd bash load lofar PATH=/soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:/soft/casacore-220216/bin:${PATH} . /soft/lofar-220216/lofarinit.sh PATH=/home/mjh/lofar/bin:$PATH PYTHONPATH=/soft/pyrap-220216/usr/lib64/python2.7/site-packages:/soft/lofar-220216/lib64/python2.7/site-packages:$PYTHONPATH LD_LIBRARY_PATH=/soft/boost/lib:/soft/casacore-220216/lib:$LD_LIBRARY_PATH genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> where <pre> genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> is the actual call to run the generic pipeline. The required arguments are a parset and configuration file. The calls to source the LOFAR software are required if this is not set in your <tt>.bashrc</tt>. You can make a more general submit script by passing the parset as an argument and using the <tt>-v</tt> flag on your qsub command. 
<pre> #!/bin/bash #PBS -k o echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: ARRAYID = $PBS_ARRAYID echo ------------------------------------------------------ echo echo running generic pipeline with parset $PARSET echo echo ------------------------------------------------------ modulecmd bash load lofar PATH=/soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:/soft/casacore-220216/bin:${PATH} . /soft/lofar-220216/lofarinit.sh PATH=/home/mjh/lofar/bin:$PATH PYTHONPATH=/soft/pyrap-220216/usr/lib64/python2.7/site-packages:$PYTHONPATH LD_LIBRARY_PATH=/soft/boost/lib:/soft/casacore-220216/lib:$LD_LIBRARY_PATH genericpipeline.py $PARSET -v -d -c /home/wwilliams/survey_pipeline/lpipeline-new.cfg </pre> and e.g. <pre> qsub -N L343220-pfc -l walltime=96:00:00 -l nodes=1:ppn=16 -W group_list=lofar -v PARSET=~/survey_pipeline/parset/survey/pfc_L343220.parset ~/survey_pipeline/runfiles/qsub/run_pipe.qsub </pre> The parset is the important part which defines all the steps to be take. The configuration file should be something like: <pre> [DEFAULT] lofarroot = /soft/lofar-220216 casaroot = /soft/casacore-1.7.0 pyraproot = /soft hdf5root = wcsroot = /opt/cep/wcslib pythonpath = /soft/lofar-220216/lib64/python2.7/site-packages runtime_directory = /car-data/wwilliams/pipeline-output/ recipe_directories = [%(pythonpath)s/lofarpipe/recipes,/home/wwilliams/scripts/git/prefactor] working_directory = /car-data/wwilliams/pipeline-products/ task_files = [%(lofarroot)s/share/pipeline/tasks.cfg] [layout] job_directory = %(runtime_directory)s/%(job_name)s [cluster] clusterdesc = %(lofarroot)s/share/cep2.clusterdesc [deploy] engine_ppath = %(pythonpath)s:%(pyraproot)s/lib:/opt/cep/pythonlibs/lib/python/site-packages engine_lpath = %(lofarroot)s/lib:%(casaroot)s/lib:%(pyraproot)s/lib:%(hdf5root)s/lib:%(wcsroot)s/lib [logging] log_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/pipeline.log xml_stat_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/statistics.xml [feedback] # Method of providing feedback to LOFAR. # Valid options: # messagebus Send feedback and status using LCS/MessageBus # none Do NOT send feedback and status method = none [remote] method = pbs_ssh max_per_node = 16 </pre> * Note <tt>[feedback] method = none</tt> is set because the messagebus does not currently work on the Herts cluster. * <tt>[remote] method = pbs_ssh</tt> allows for the use of multiple nodes. * The <tt>recipe_directories</tt> should be modified to include the Pre-Facet-Cal recipes (<tt>/home/wwilliams/scripts/git/prefactor</tt>). = Pre-Facet calibration = Set up the pre-facet calibration pipeline * download scripts from https://github.com/lofar-astron/prefactor. In addition to the changes mentioned in the genericpipeline -> quick-start page, you need to add the Pre-Facet-Cal directory to the "recipe_directories" in the pipeline.cfg (so that the plugins that come with the Pre-Facet-Cal are found). 
The current cookbook-section on the pre-facet pipeline can be found at https://github.com/lofar-astron/prefactor/blob/AH-development/docs/cookbook_prefacet.pdf * Edit the Pre-Facet-Cal.parset appropriately (see notes in the parset) and run the generic pipeline with this pre-facet-calibration parset. == Some known problems == * in argument.flags there should be no spaces! e.g. <pre>h5imp_cal.argument.flags = [h5_imp_map.output.mapfile, h5imp_cal_losoto.h5]</pre> * check you're sourcing the right lofar software version eb2ad0ed78ec6f801c30f45371ac5a16dea04a5b 458 457 2016-11-15T16:26:22Z Wwilliams 13 /* running the pipeline */ wikitext text/x-wiki = The generic pipeline = The generic pipeline comes with LOFAR builds 2.13+. At http://www.astron.nl/citt/genericpipeline/#quick-start there is a description of how to setup and run the generic pipeline. At Herts, the pipeline is included in the most recent lofar build (2.15 : <tt>$LOFARROOT=/soft/lofar-051116/</tt>). The pipeline can run across multiple whole nodes, in that you can submit a job which will run on N nodes. The [[jobs|job control system]] will allocate the nodes to you and the generic pipeline will deal with distributing the processes to the other nodes. The pipeline sends processes to other available nodes using <tt>ssh</tt>. The LOFAR software should also be sourced in your bash or csh rc file, so that all the nodes can run the lofar commands, e.g.: <pre> alias lofar-newest 'setenv PATH /soft/casa-release-4.5.0-el6:/soft/casacore-220216/bin:${PATH} ; source /soft/lofar-051116/lofarinit.csh ; setenv PYTHONPATH /soft/pyrap-220216/usr/lib64/python2.7/site-packages:/soft/lofar-220216/lib64/python2.7/site-packages:$PYTHONPATH ; setenv LD_LIBRARY_PATH /soft/boost/lib:/soft/casacore-220216/lib:$LD_LIBRARY_PATH' alias lofar-tools 'setenv PYTHONPATH /home/wwilliams/software/lib/python2.7/site-packages/:/home/wwilliams/software/anaconda2/lib/python2.7/site-packages/:$PYTHONPATH' lofar-newest lofar-tools </pre> To allow multiple versions of the pipeline on different nodes, the ssh host checking needs to be bypassed for localhost. This can be done in your <tt>.ssh/config</tt> file: <pre> Host localhost StrictHostKeyChecking no UserKnownHostsFile=/dev/null </pre> [[Passwordless_ssh]] access between the nodes needs to be set up. = running the pipeline = The pipeline can be launched by submitting a job to the job control system, e.g.: <pre> qsub -l nodes=4:ppn=16 -W group_list=lofar ~/survey_pipeline/runfiles/qsub/run_pipe_L400115.qsub </pre> where <tt>nodes=4:ppn=16</tt> will request for the job to run on 4 entire (16 cpu) nodes. 
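Inside the job script you can confirm that the allocation matches the request by counting the entries in <tt>$PBS_NODEFILE</tt> (a small sketch; with <tt>nodes=4:ppn=16</tt> the first count should be 64 and the second 4):
<pre>
cat $PBS_NODEFILE | wc -l        # total slots allocated (nodes x ppn)
sort -u $PBS_NODEFILE | wc -l    # number of distinct physical nodes
</pre>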
Alternatively it can be run on a part of a node (being careful to ensure that your parset matches the resources you are requesting): <pre> qsub -l nodes=1:ppn=8 -W group_list=lofar ~/survey_pipeline/runfiles/qsub/run_pipe_L400115.qsub </pre> An example <tt>qsub</tt> script is: <pre> #!/bin/bash #PBS -N pipeline-L424617 #PBS -l walltime=168:00:00 #PBS -k o echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: ARRAYID = $PBS_ARRAYID echo ------------------------------------------------------ modulecmd bash load lofar PATH=/soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:/soft/casacore-220216/bin:${PATH} . /soft/lofar-051116/lofarinit.sh PATH=/home/mjh/lofar/bin:$PATH PYTHONPATH=/soft/pyrap-220216/usr/lib64/python2.7/site-packages:/soft/lofar-051116/lib64/python2.7/site-packages:$PYTHONPATH LD_LIBRARY_PATH=/soft/boost/lib:/soft/casacore-220216/lib:$LD_LIBRARY_PATH genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> where <pre> genericpipeline.py /home/wwilliams/survey_pipeline/parset/PreFacetCal_L424617.parset -v -d -c /home/wwilliams/survey_pipeline/lpipeline.cfg </pre> is the actual call to run the generic pipeline. The required arguments are a parset and configuration file. The calls to source the LOFAR software are required if this is not set in your <tt>.bashrc</tt>. You can make a more general submit script by passing the parset as an argument and using the <tt>-v</tt> flag on your qsub command. <pre> #!/bin/bash #PBS -k o echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: ARRAYID = $PBS_ARRAYID echo ------------------------------------------------------ echo echo running generic pipeline with parset $PARSET echo echo ------------------------------------------------------ modulecmd bash load lofar PATH=/soft/casapy-42.2.30986-1-64b:/home/mjh/bin/postgresql/bin:/soft/casacore-220216/bin:${PATH} . /soft/lofar-051116/lofarinit.sh PATH=/home/mjh/lofar/bin:$PATH PYTHONPATH=/soft/pyrap-220216/usr/lib64/python2.7/site-packages:$PYTHONPATH LD_LIBRARY_PATH=/soft/boost/lib:/soft/casacore-220216/lib:$LD_LIBRARY_PATH genericpipeline.py $PARSET -v -d -c /home/wwilliams/survey_pipeline/lpipeline-new.cfg </pre> and e.g. <pre> qsub -N L343220-pfc -l walltime=96:00:00 -l nodes=1:ppn=16 -W group_list=lofar -v PARSET=~/survey_pipeline/parset/survey/pfc_L343220.parset ~/survey_pipeline/runfiles/qsub/run_pipe.qsub </pre> The parset is the important part which defines all the steps to be take. 
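Purely as an illustration of the parset syntax (the step name, executable and flag here are hypothetical placeholders, not taken from a real prefactor parset), a single step is declared along these lines:
<pre>
pipeline.steps = [dummy_step]

dummy_step.control.kind = recipe
dummy_step.control.type = executable_args
dummy_step.argument.executable = /bin/echo
dummy_step.argument.flags = [hello]
</pre>
For real runs, start from the parsets shipped with prefactor rather than writing one from scratch.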
The configuration file should be something like: <pre> [DEFAULT] lofarroot = /soft/lofar-051116 casaroot = /soft/casacore-1.7.0 pyraproot = /soft hdf5root = wcsroot = /opt/cep/wcslib pythonpath = /soft/lofar-051116/lib64/python2.7/site-packages runtime_directory = /car-data/wwilliams/pipeline-output/ recipe_directories = [%(pythonpath)s/lofarpipe/recipes,/home/wwilliams/scripts/git/prefactor] working_directory = /car-data/wwilliams/pipeline-products/ task_files = [%(lofarroot)s/share/pipeline/tasks.cfg] [layout] job_directory = %(runtime_directory)s/%(job_name)s [cluster] clusterdesc = %(lofarroot)s/share/cep2.clusterdesc [deploy] engine_ppath = %(pythonpath)s:%(pyraproot)s/lib:/opt/cep/pythonlibs/lib/python/site-packages engine_lpath = %(lofarroot)s/lib:%(casaroot)s/lib:%(pyraproot)s/lib:%(hdf5root)s/lib:%(wcsroot)s/lib [logging] log_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/pipeline.log xml_stat_file = %(runtime_directory)s/%(job_name)s/logs/%(start_time)s/statistics.xml [feedback] # Method of providing feedback to LOFAR. # Valid options: # messagebus Send feedback and status using LCS/MessageBus # none Do NOT send feedback and status method = none [remote] method = pbs_ssh max_per_node = 16 </pre> * Note <tt>[feedback] method = none</tt> is set because the messagebus does not currently work on the Herts cluster. * <tt>[remote] method = pbs_ssh</tt> allows for the use of multiple nodes. * The <tt>recipe_directories</tt> should be modified to include the Pre-Facet-Cal recipes (<tt>/home/wwilliams/scripts/git/prefactor</tt>). = Pre-Facet calibration = Set up the pre-facet calibration pipeline * download scripts from https://github.com/lofar-astron/prefactor. In addition to the changes mentioned in the genericpipeline -> quick-start page, you need to add the Pre-Facet-Cal directory to the "recipe_directories" in the pipeline.cfg (so that the plugins that come with the Pre-Facet-Cal are found). The current cookbook-section on the pre-facet pipeline can be found at https://github.com/lofar-astron/prefactor/blob/AH-development/docs/cookbook_prefacet.pdf * Edit the Pre-Facet-Cal.parset appropriately (see notes in the parset) and run the generic pipeline with this pre-facet-calibration parset. == Some known problems == * in argument.flags there should be no spaces! e.g. <pre>h5imp_cal.argument.flags = [h5_imp_map.output.mapfile, h5imp_cal_losoto.h5]</pre> * check you're sourcing the right lofar software version 7265cc9e30d0ce6072ba0fc0251b79ef21379b48 Software 0 17 435 405 2016-02-23T13:27:09Z Mjh 2 wikitext text/x-wiki This page documents locations of software. Detailed local documentation of software should go in a page specific to that software. 
* <u>[[Gromacs]]</u>: 4.5.5 installed in <tt>/soft/gromacs-new</tt> * Autodock <u>[[Vina]]</u>: 1.1.1 installed in <tt>/soft/autodock_vina_1_1_1_linux_x86</tt> * <u>[[Autodock]]</u>: 4.2 installed in <tt>/soft/autodock</tt> * <u>[[iGemDock]]</u>: 2.1 installed in <tt>/soft/iGEMDOCKv2.1-centos/</tt> * <u>[[AIPS]]</u>: 31DEC10 installed in <tt>/soft/aips</tt> * <u>[[CASA]]</u>: installed in <tt>/soft/casapy...</tt> * <u>[[IDL]]</u>: in <tt>/soft/idl/idl/bin</tt> * <u>[[Matlab]]</u>: in <tt>/soft/MATLAB/R2013b/bin/Matlab</tt> * <u>[[Starlink]]</u>: in <tt>/soft/star</tt> or <tt>/soft/stardev</tt> * <u>[[LOFAR]]</u>: in <tt>/soft/lofar-xxxxxx</tt>, see page for details * <u>[[Python packages]]</u>: set your PYTHONPATH to include <tt>/soft/python/lib64/python2.7/site-packages</tt> * <u>[[neuron]]</u>: in <tt> /soft/nrn</tt> * <u>[[Miriad]]</u>: in <tt> /soft/miriad</tt> * <u>[[ciao]]</u>: in <tt>/soft/ciao-x.x</tt> e2a56110c439ace3b692c9e1a44c3c67a61da509 468 435 2017-03-23T10:03:32Z H.patel 14 wikitext text/x-wiki This page documents locations of software. Detailed local documentation of software should go in a page specific to that software. * <u>[[Gromacs]]</u>: 2016 - GPU acceleration installed in <tt>/soft/gromacs-2016-gpu</tt> * Autodock <u>[[Vina]]</u>: 1.1.1 installed in <tt>/soft/autodock_vina_1_1_1_linux_x86</tt> * <u>[[Autodock]]</u>: 4.2 installed in <tt>/soft/autodock</tt> * <u>[[iGemDock]]</u>: 2.1 installed in <tt>/soft/iGEMDOCKv2.1-centos/</tt> * <u>[[AIPS]]</u>: 31DEC10 installed in <tt>/soft/aips</tt> * <u>[[CASA]]</u>: installed in <tt>/soft/casapy...</tt> * <u>[[IDL]]</u>: in <tt>/soft/idl/idl/bin</tt> * <u>[[Matlab]]</u>: in <tt>/soft/MATLAB/R2013b/bin/Matlab</tt> * <u>[[Starlink]]</u>: in <tt>/soft/star</tt> or <tt>/soft/stardev</tt> * <u>[[LOFAR]]</u>: in <tt>/soft/lofar-xxxxxx</tt>, see page for details * <u>[[Python packages]]</u>: set your PYTHONPATH to include <tt>/soft/python/lib64/python2.7/site-packages</tt> * <u>[[neuron]]</u>: in <tt> /soft/nrn</tt> * <u>[[Miriad]]</u>: in <tt> /soft/miriad</tt> * <u>[[ciao]]</u>: in <tt>/soft/ciao-x.x</tt> a37c0a7568d4b27a3a4db025db079ccdbdbdf74d 471 468 2017-03-23T10:21:51Z H.patel 14 wikitext text/x-wiki This page documents locations of software. Detailed local documentation of software should go in a page specific to that software. * <u>[[Gromacs]]</u>: 2016 (with GPU acceleration) installed in <tt>/soft/gromacs-2016-gpu</tt> * Autodock <u>[[Vina]]</u>: 1.1.1 installed in <tt>/soft/autodock_vina_1_1_1_linux_x86</tt> * <u>[[Autodock]]</u>: 4.2 installed in <tt>/soft/autodock</tt> * <u>[[iGemDock]]</u>: 2.1 installed in <tt>/soft/iGEMDOCKv2.1-centos/</tt> * <u>[[AIPS]]</u>: 31DEC10 installed in <tt>/soft/aips</tt> * <u>[[CASA]]</u>: installed in <tt>/soft/casapy...</tt> * <u>[[IDL]]</u>: in <tt>/soft/idl/idl/bin</tt> * <u>[[Matlab]]</u>: in <tt>/soft/MATLAB/R2013b/bin/Matlab</tt> * <u>[[Starlink]]</u>: in <tt>/soft/star</tt> or <tt>/soft/stardev</tt> * <u>[[LOFAR]]</u>: in <tt>/soft/lofar-xxxxxx</tt>, see page for details * <u>[[Python packages]]</u>: set your PYTHONPATH to include <tt>/soft/python/lib64/python2.7/site-packages</tt> * <u>[[neuron]]</u>: in <tt> /soft/nrn</tt> * <u>[[Miriad]]</u>: in <tt> /soft/miriad</tt> * <u>[[ciao]]</u>: in <tt>/soft/ciao-x.x</tt> 4c3a37b091c81a0551770d26649a036e3b09cd13 Ciao 0 66 436 2016-02-23T13:27:41Z Mjh 2 Created page with "CIAO is the Chandra data reduction software. 
Access it by doing <tt>source /soft/ciao-4.8/ciao-4.8/bin/ciao.csh</tt>" wikitext text/x-wiki CIAO is the Chandra data reduction software. Access it by doing <tt>source /soft/ciao-4.8/ciao-4.8/bin/ciao.csh</tt> 59d0f4bc229cb31f52e893e459a88a4b4aef155d Main Page 0 1 445 409 2016-04-01T20:15:18Z Mjh 2 /* Using the cluster */ wikitext text/x-wiki == Welcome to the cluster documentation wiki == This wiki is the location for documentation for the STRI cluster. If you are a cluster user, feel free to register for an account so that you can describe any method you use to do a particular thing on the cluster. You must be approved for an account before editing the Wiki. Some local documentation on [[MediaWiki]] is available. == Known problems == * [[Known problems]] == Cluster basics == * [[Accounts]] and [[Account cancellation policy]] * [[Policies]] * [[Fair share]] * [[Access]] * [[Architecture]] * [[Networking]] * [[Storage]]; [[quota]] system * [[Administrators]]' contact details == Using the cluster == * [[Jobs]] * [[Queues]] * [[Reservations]] * [[SMP machines]] * [[Tesla]] * [[Modules]] * [[MPI]] * [[OpenMP]] * [[Parallelization|How to parallelize your job]] * [[Compilers]] * [[Software]] * [[Mail]] * [[Web server]] * [[Monitoring]] * [[LOFAR-UK Compute Facility]] == Troubleshooting == * [[Why doesn't my job run?]] * [[Job errors]] == Publications == * [[Acknowledgements]] * [[Cluster bibliography]] == How-Tos == * [[Star-CCM+|How to use Star-CCM+ on the cluster]] 51677982500d58a12c78358571b5d9b93dc4c73e Monitoring 0 67 446 2016-04-01T20:17:27Z Mjh 2 Created page with "Some monitoring services run on the head node. You can monitor the state of the main servers with Munin[http://stri-cluster.herts.ac.uk/munin/] and the state of the nodes and..." wikitext text/x-wiki Some monitoring services run on the head node. You can monitor the state of the main servers with Munin[http://stri-cluster.herts.ac.uk/munin/] and the state of the nodes and network with Ganglia[http://stri-cluster.herts.ac.uk/ganglia]. b97abe1eb332e3b2742c1870b48f483fb8070692 Jobs 0 9 447 383 2016-04-27T19:59:54Z Mjh 2 wikitext text/x-wiki The main mechanism for carrying out computing on the cluster is the '''batch queue system'''. This is based around [http://docs.adaptivecomputing.com/torque/help.htm Torque] (formerly known as PBS) and [http://docs.adaptivecomputing.com/maui/index.php Maui], which are free software and widely used in the HPC world, particularly in academia. Torque is the queueing system, which handles job submission, and Maui is the scheduler, which decides what jobs are run. Rather than running processes directly on the compute nodes, you submit a script which, in general, contains both directives to the queuing system and the commands needed to run your jobs. The scheduler will then decide, based on the available resources, whether your job can be run immediately or must be deferred till later. If other people's jobs are running, you are likely to have to wait. It is important therefore to request only the minimum resources you need: a job that only needs 8 nodes may be able to run immediately while a job that requests all 48 nodes in the main cluster may need to wait for days. Once your job starts running the standard output and standard error can be saved or discarded, at your option. Most jobs will operate on files in one of the cluster-wide file systems (see [[storage]]). You can also arrange to have the system send you e-mail when your job starts and/or ends. 
You should enable [[passwordless ssh]] between nodes before starting to use the job submission system. Various parts of it and code that depends on it rely on being able to use ssh or scp. == Basic commands == The command to submit a job to the batch queue system is <tt>qsub</tt>. <tt>qsub</tt> takes one obligatory argument, the name of the script that will be run to start the job: <pre> qsub myjob.sh </pre> This must be a script: specifying a binary file will not work. Optionally, qsub can take a number of arguments that specify what resources the job requires and how it is to be handled by the queuing system. Alternatively, these can be specified in the script, tagged with lines beginning <tt>#PBS</tt> before any executable lines. For example, <pre> cat << END > myjob1.sh #!/bin/sh echo "Hello world" END qsub -N hello -m abe myjob1.sh </pre> and <pre> cat <<END > myjob2.sh #!/bin/sh #PBS -N hello #PBS -m abe echo "Hello world" END qsub myjob2.sh </pre> are functionally equivalent. It is often more useful to have the options specified in the script. You can also have a mixture: if the same option is specified on the <tt>qsub</tt> command line and in a script directive, the command line takes precedence. A full list of qsub options is given by <tt>man qsub</tt> or is available [http://docs.adaptivecomputing.com/torque/2-5-12/help.htm#topics/commands/qsub.htm online]. Some important ones are as follows: * -N: specify the name of the job as it will appear in the queue * -m: say when the system should e-mail you: <tt>-m abe</tt> means to e-mail on abort, begin and end. Before setting this flag, read the page on [[Mail]] and take appropriate action! * -k: specify whether the standard output or standard error streams of the job should be mirrored in your home directory. If you specify this, you will be able to see the output as it's generated, but it will appear in your home directory. If you don't, the output will be stored in the directory specified by -o and -e or in the current working directory of qsub if they are not specified, but it will only appear once the job is finished. * -j: specify whether the output and error streams should be kept separate or merged. * -o, -e: specify where standard output/error should be stored, if not in the current directory * -q: specify what [[queues|queue]] the job will be run on. * -t: start multiple jobs simultaneously (see below) * -W: for job inter-dependencies (see below) * -v: to set environment variables (see below) The most important option is <tt>-l</tt>, which specifies the resources to be used by the job. On our system the useful resources to specify are * <tt>walltime</tt>: the expected execution time in h:m:s format (if unspecified, the default walltime on all queues is 24 hours). * <tt>nodes</tt>: a list of requirements on the nodes to be allocated, or possibly more than one joined by a plus sign. * <tt>pmem</tt>: the physical [[memory]] requirements for your job. * <tt>file</tt>: the [[local disk space]] to be used by your job. 
For example, <pre> qsub -l walltime=2:0:0 -l nodes=40 job.sh ## run for two hours on 40 CPUs (which may be spread over up to 40 physical nodes) qsub -l walltime=1:0:0 -l nodes=16:ppn=8 job.sh ## run for one hour on 16 nodes with 8 CPUs per node qsub -l walltime=0:1:0 -l nodes=2:chassis1:ppn=4 job2.sh ## run for one minute on 2 nodes from chassis1 with 4 CPUs per node qsub -l walltime=0:5:0 -l nodes=1:chassis1:ppn=1+1:chassis2:ppn=1 job3.sh ## run for 5 minutes on one node from chassis1 and one from chassis2 qsub -l walltime=0:0:10 -l nodes=node001:ppn=8 job4.sh ## run for 10 seconds on all CPUs of node001 qsub -l walltime=0:1:0 -l nodes=1:2 -l pmem=8gb job5.sh ## Run two processes for one minute, requesting 8 Gb of memory each </pre> (In general you will not want to specify node or [[architecture|chassis]] names directly, as this will interfere with the efficiency of the scheduling. The exception might be if you want to guarantee that all inter-process communications take place within a chassis; see [[networking]]). Note also that in the present configuration requests for fewer processors per node than the 8 that are available may be consolidated onto fewer nodes. So the third request above, run on an empty cluster, would actually run on one physical node with 8 CPUs. This should never be a problem unless you are specifically testing inter-node communication. For efficiency, you are recommended always to ask for and use 8 processes per node for MPI or multi-threaded code. If you are running single-threaded code, you should ask for one processor per node. The defaults for all the other options are sensible but you ''must'' think carefully about the resources you need. Requesting more time or resources than you need will lead to inefficiency in scheduling your jobs: requesting less than you need (particularly in terms of walltime) is likely to mean that your job does not run to completion. A job that exceeds the walltime estimate will be terminated. 
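To see how much walltime a running job has used against what it requested, the full job listing can be filtered (a small sketch; <tt>123456</tt> is a placeholder job id):
<pre>
qstat -f 123456 | grep -i walltime    # shows resources_used.walltime and Resource_List.walltime
</pre>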
''A walltime should be specified: the default walltime (see [[Queues]]) is probably not what you want.'' Once a job is submitted, you can view its progress with <tt>qstat</tt>: <pre> >qstat Job id Name User Time Use S Queue ------------------- ---------------- --------------- -------- - ----- 1765.stri-cluster HCG62-doublebeta mjh 00:00:00 R all 1766.stri-cluster ...59-doublebeta mjh 00:00:00 R all 1767.stri-cluster ...07-doublebeta mjh 00:00:00 R all 1768.stri-cluster ...83-doublebeta mjh 00:00:00 R all 1769.stri-cluster ...73-doublebeta mjh 00:00:00 R all 1770.stri-cluster ...61-doublebeta mjh 0 Q all 1771.stri-cluster ...36-doublebeta mjh 0 Q all 1772.stri-cluster ...44-doublebeta mjh 0 Q all 1773.stri-cluster ...29-doublebeta mjh 0 Q all 1774.stri-cluster ...33-doublebeta mjh 0 Q all 1775.stri-cluster ...46-doublebeta mjh 0 Q all >qstat -a stri-cluster.herts.ac.uk: Req'd Req'd Elap Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time -------------------- -------- -------- ---------- ------ ----- --- ------ ----- - ----- 1765.stri-cluster.he mjh all HCG62-doub 11125 16 -- -- 05:00 R 01:56 1766.stri-cluster.he mjh all IC1459-dou 10658 16 -- -- 05:00 R 01:56 1767.stri-cluster.he mjh all NGC3607-do 7355 16 -- -- 05:00 R 01:56 1768.stri-cluster.he mjh all NGC383-dou 12069 16 -- -- 05:00 R 01:56 1769.stri-cluster.he mjh all NGC4073-do 12793 16 -- -- 05:00 R 01:56 1770.stri-cluster.he mjh all NGC4261-do -- 16 -- -- 05:00 Q -- 1771.stri-cluster.he mjh all NGC4636-do -- 16 -- -- 05:00 Q -- 1772.stri-cluster.he mjh all NGC5044-do -- 16 -- -- 05:00 Q -- 1773.stri-cluster.he mjh all NGC5129-do -- 16 -- -- 05:00 Q -- 1774.stri-cluster.he mjh all NGC533-dou -- 16 -- -- 05:00 Q -- 1775.stri-cluster.he mjh all NGC5846-do -- 16 -- -- 05:00 Q -- </pre> The different options to qstat give different views of the queue. Here we see that some jobs are running (R) and some are queued (Q). You may also see jobs that have recently completed (C), jobs that are held pending some condition (H) and, hopefully rarely, jobs where there is some error in the scheduling system (E). The MAUI tool <tt>showq</tt> can also be used to get a view of the whole queue, which is sometimes more user-friendly: <pre> /usr/local/maui/bin/showq ACTIVE JOBS-------------------- JOBNAME USERNAME STATE PROC REMAINING STARTTIME 1765 mjh Running 128 1:15:20 Fri May 7 13:38:20 1766 mjh Running 128 1:15:20 Fri May 7 13:38:20 1767 mjh Running 128 1:15:20 Fri May 7 13:38:20 1768 mjh Running 128 1:15:20 Fri May 7 13:38:20 1769 mjh Running 128 1:15:20 Fri May 7 13:38:20 5 Active Jobs 640 of 640 Processors Active (100.00%) 80 of 80 Nodes Active (100.00%) IDLE JOBS---------------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME 1770 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1771 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1772 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1773 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1774 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1775 mjh Idle 128 5:00:00 Fri May 7 13:38:20 6 Idle Jobs BLOCKED JOBS---------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME Total Jobs: 11 Active Jobs: 5 Idle Jobs: 6 Blocked Jobs: 0 </pre> Finally, you can delete jobs from the queue with the Torque command <tt>qdel</tt>, you can change your resource request with <tt>qalter</tt>, and you can move queued jobs between different queues with <tt>qmove</tt>. == Running code == Your job will start running on ''one'' of the nodes it has been allocated. 
It is your script's responsibility to start your code running on any other nodes you have been allocated. At run time, environment variables contain information about the job. A full list is [http://docs.adaptivecomputing.com/torque/2-5-12/help.htm#topics/2-jobs/exportedBatchEnvVar.htm here]. The crucial one is $PBS_NODEFILE, which provides a list of nodes on which the job should run. ''You must honour this node list.'' The [http://docs.adaptivecomputing.com/torque/2-5-12/help.htm#topics/commands/pbsdsh.htm pbsdsh] command can be used in your script to run a single program on all the allocated nodes, as in the following example: <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ pbsdsh hostname echo ------------------------------------------------------ echo Job ends </pre> pbsdsh would be suitable for starting a number of parallel processes which do not need to communicate, e.g. multiple identical Monte Carlo simulations. For scripts to start [[MPI]] processes in such a way that inter-process communication (IPC) works see the description of [[MPI]]. If you are running [[OpenMP]] code, which should only be running on a single node, then you need to make sure that the code does not grab all available CPUs. An easy way to make sure that you take only the CPUs that you have been allocated would be to put <pre> setenv OMP_NUM_THREADS `cat $PBS_NODEFILE | wc -l` </pre> in the qsub script before the code runs. == Environment == In all cases your script must set the environment that you want your program to run in. Environment variables, aliases and working directory are not inherited from the shell that runs qsub. So you will often want to change to an appropriate working directory and to source some startup files in your qsub script before you actually execute any code. 
<pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ cd /home/fred/my_working_directory export PATH=/home/fred/my_binaries:${PATH} /usr/local/bin/mpiexec my-mpi-code arg1 arg2 echo ------------------------------------------------------ echo Job ends </pre> Note that you can set environment variables in the environment of the running script with the <tt>-v</tt> option to <tt>qsub</tt>: <pre> qsub -v NAME=fred myjob.sh </pre> The value of <tt>$NAME</tt> is then available to the script. This is useful if you want to write a generic script and control it using a parameter, rather than editing the script. == Multiple job submission == The <tt>-t</tt> option to qsub allows you to create multiple jobs simultaneously. These will all run the same script but will differ in the value of the environment variable <tt>$PBS_ARRAYID</tt>. Your script should normally use this variable to decide how to behave differently. For example, multiple runs of a Monte Carlo simulation might use it to store the output in different files. The jobs will all be submitted to the queue immediately and may run concurrently, depending on the resources they request, so it is important to make sure that it is safe to do this. A common mistake is to use the -t option wrongly. <pre> qsub -t 4 myjob.qsub </pre> will only start ''one'' job, with $PBS_ARRAYID set to 4. <pre> qsub -t 1-4 myjob.qsub </pre> will queue four jobs, with $PBS_ARRAYID ranging from 1 to 4. == Interactive jobs == If you need to access nodes interactively, see the separate page on [[interactive jobs]]. == Jobs that depend on other jobs == You may wish to submit jobs that depend on each other: for example, to submit a job that only runs after another job has finished. (This allows you to stack up very long computing requests without violating the maximum walltime constraints.) Look at <tt>man qsub</tt> for the full details, but for a simple dependence use <pre> qsub -W depend=afterany:123456 myjob.qsub </pre> which will cause your job to be run only after job 123456 has finished (no matter what its output status). af1d5e98732a89478b4b925b7e6269d6ae23c771 Neuron 0 53 448 369 2016-07-29T15:16:03Z Dbab 11 wikitext text/x-wiki neuron is installed in /soft/nrn to run neuron you should have the library path on path, as well as the mpi. Set it by using <pre> setenv LD_LIBRARY_PATH /soft/lib setenv PATH ${PATH}:/soft/mpi/openmpi-1.4.3/bin/ </pre> Notice that there are many mpi installed, so you might want to pick another one To make the changes permanent, copy paste the following either add them to your .tcshrc, or copy paste the following in your terminal <pre> echo 'setenv LD_LIBRARY_PATH:/soft/lib'>>~/.tcshrc echo 'setenv PATH ${PATH}:/soft/mpi/openmpi-1.4.3/bin/'>>~/.tcshrc </pre> Now you can run neuron using <pre> /soft/nrn/x86_64/bin/nrniv /soft/nrn/x86_64/bin/nrngui etc. 
</pre> But don't run experiments directly. To do so you need to use [[Jobs]]. a41fe23c4a172da45c8a474734df977eaee3f692 LOFAR-UK Compute Facility 0 57 449 422 2016-08-04T16:06:53Z Mjh 2 wikitext text/x-wiki The LOFAR-UK compute facility is a part of the STRI cluster reserved for LOFAR-UK use. It consists of a dedicated login machine and file server with 128 GB RAM and 16 cores, plus some number (up to 12) of compute nodes mostly with 64 GB RAM and 16 cores. (Some machines have 192 or 256 GB RAM.) Reservations on the more powerful [[SMP machines]] can be made if necessary. LOFAR-UK users can get an account by contacting Martin Hardcastle. This will give them the ability to log in to <tt>lofar.herts.ac.uk</tt>. Data can be downloaded to the dedicated area <tt>/data/lofar/</tt>. Data processing should generally be carried out on the compute nodes. When you have a particular project to do, ask for a reservation of enough of the LOFAR nodes to allow you to do what you need to do. When you're finished, let us know so that the reservation can be released. (Local, i.e. Herts-based, users need to submit jobs with the option <tt>-W group_list=lofar</tt> to make use of the reservation. External LOFAR-UK users do not.) All the standard cluster commands work on lofar.herts.ac.uk. You can do your analysis by running interactive or non-interactive jobs, as appropriate. See the [[LOFAR]] page for information on LOFAR software. A description of the [[Herts LOFAR HBA pipeline]] is available. A description of the [[generic pipeline]] is available. a46d6a96c2282f3d09803d838e6759dbdf2fb444 452 449 2016-08-04T16:14:52Z Mjh 2 wikitext text/x-wiki The LOFAR-UK compute facility is a part of the STRI cluster reserved for LOFAR-UK use. It consists of a dedicated login machine and file server with 128 GB RAM and 16 cores, plus some number (up to 12) of compute nodes mostly with 64 GB RAM and 16 cores. (Some machines have 192 or 256 GB RAM.) Reservations on the more powerful [[SMP machines]] can be made if necessary. LOFAR-UK users can get an account by contacting Martin Hardcastle. This will give them the ability to log in to <tt>lofar.herts.ac.uk</tt>. Data can be downloaded to the dedicated area <tt>/data/lofar/</tt>. Data processing should generally be carried out on the compute nodes. When you have a particular project to do, ask for a reservation of enough of the LOFAR nodes to allow you to do what you need to do. When you're finished, let us know so that the reservation can be released. (Local, i.e. Herts-based, users need to submit jobs with the option <tt>-W group_list=lofar</tt> to make use of the reservation. External LOFAR-UK users do not.) All the standard cluster commands work on lofar.herts.ac.uk. You can do your analysis by running interactive or non-interactive [[jobs]], as appropriate. See the [[LOFAR]] page for information on LOFAR software. A description of the [[Herts LOFAR HBA pipeline]] is available. A description of the [[generic pipeline]] is available. eccb4b4909d736b86e3e86dc941a5ec871e5663b 460 452 2016-12-06T17:12:58Z Mjh 2 wikitext text/x-wiki The LOFAR-UK compute facility is a part of the STRI cluster reserved for LOFAR-UK use. It consists of a dedicated login machine and file server with 256 GB RAM and 16 cores, plus some number (up to 12) of compute nodes mostly with 64 GB RAM and 16 cores. (Some machines have 192 or 256 GB RAM.) Reservations on the more powerful [[SMP machines]] can be made if necessary. LOFAR-UK users can get an account by contacting Martin Hardcastle. 
This will give them the ability to log in to <tt>lofar.herts.ac.uk</tt>. Data can be downloaded to the dedicated area <tt>/data/lofar/</tt>. Data processing should generally be carried out on the compute nodes. When you have a particular project to do, ask for a reservation of enough of the LOFAR nodes to allow you to do what you need to do. When you're finished, let us know so that the reservation can be released. (Local, i.e. Herts-based, users need to submit jobs with the option <tt>-W group_list=lofar</tt> to make use of the reservation. External LOFAR-UK users do not.) All the standard cluster commands work on lofar.herts.ac.uk. You can do your analysis by running interactive or non-interactive [[jobs]], as appropriate. See the [[LOFAR]] page for information on LOFAR software. A description of the [[Herts LOFAR HBA pipeline]] is available. A description of the [[generic pipeline]] is available. 3af075a16d54f16da479f42896c509ada2649fec LOFAR 0 47 450 385 2016-08-04T16:12:05Z Mjh 2 wikitext text/x-wiki You will need a <tt>.casarc</tt> file, something like this should do: <pre> measures.DE200.directory: /soft/casapy/data/ephemerides measures.DE405.directory: /soft/casapy/data/ephemerides measures.line.directory: /soft/casapy/data/ephemerides measures.sources.directory: /soft/casapy/data/ephemerides measures.comet.directory: /soft/casapy/data/ephemerides measures.ierseop97.directory: /soft/casapy/data/geodetic measures.ierspredict.directory: /soft/casapy/data/geodetic measures.tai_utc.directory: /soft/casapy/data/geodetic measures.igrf.directory: /soft/casapy/data/geodetic measures.observatory.directory: /soft/casapy/data/geodetic </pre> Then a full setup for up-to-date versions of the LOFAR software looks something like this: <pre> setenv PATH /soft/casacore-290316/bin:/soft/casa-release-4.5.0-el6:${PATH} source /soft/lofar-090616/lofarinit.csh setenv PYTHONPATH /soft/pyrap-220216/usr/lib64/python2.7/site-packages/:$PYTHONPATH setenv LD_LIBRARY_PATH /soft/boost/lib:/soft/casacore-220216/lib:$LD_LIBRARY_PATH' </pre> The LOFAR software is frequently updated. For the most up-to-date versions look for /soft/lofar-date where date is a numeric build date. Then source /soft/lofar-date/lofarinit.csh instead. 423f17b751873d26b2d10e4d2b554a9b7dfbc48d 451 450 2016-08-04T16:13:53Z Mjh 2 wikitext text/x-wiki You will need a <tt>.casarc</tt> file, something like this should do: <pre> measures.DE200.directory: /soft/casapy/data/ephemerides measures.DE405.directory: /soft/casapy/data/ephemerides measures.line.directory: /soft/casapy/data/ephemerides measures.sources.directory: /soft/casapy/data/ephemerides measures.comet.directory: /soft/casapy/data/ephemerides measures.ierseop97.directory: /soft/casapy/data/geodetic measures.ierspredict.directory: /soft/casapy/data/geodetic measures.tai_utc.directory: /soft/casapy/data/geodetic measures.igrf.directory: /soft/casapy/data/geodetic measures.observatory.directory: /soft/casapy/data/geodetic </pre> Then a full setup for up-to-date versions of the LOFAR software looks something like this: <pre> setenv PATH /soft/casacore-290316/bin:${PATH} source /soft/lofar-090616/lofarinit.csh setenv PYTHONPATH /soft/pyrap-220216/usr/lib64/python2.7/site-packages/:$PYTHONPATH setenv LD_LIBRARY_PATH /soft/boost/lib:/soft/casacore-220216/lib:$LD_LIBRARY_PATH' </pre> The LOFAR software is frequently updated. For the most up-to-date versions look for /soft/lofar-date where date is a numeric build date. Then source /soft/lofar-date/lofarinit.csh instead. 
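A simple way to see which builds are currently installed (a sketch; it just lists the build directories referred to above):
<pre>
ls -d /soft/lofar-*
</pre>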
For running FACTOR you will also want casapy (<tt>/soft/casa-release-4.5.0-el6</tt>) and/or wsclean (<tt>/soft/wsclean/bin</tt>) on your path as well. fb63014a96dc6e1ec8fde6c450a9cabe09a1dcbe Accounts 0 3 453 306 2016-09-05T15:45:19Z Asinha 12 wikitext text/x-wiki To get an account, speak to Leigh Smith in E117C. Accounts are available to the following classes of people: * Members of the Centre for Astrophysics Research (CAR) * Members of the Centre for Atmospheric & Instrumentation Research (CAIR) * Other research-active members of the School of Physics, Astronomy and Mathematics (PAM) * Members of the School of Computer Science (CS) * Others, by special arrangement; restricted to those who have made a financial contribution to the cluster. Access is granted subject to observance of our usage [[policies]]. Accounts may be suspended or cancelled if the cluster is abused: see also our [[Account cancellation policy]]. c09b5c4696218d930c46663ad70ef67a64a34ce4 Acknowledgements 0 29 454 304 2016-09-29T09:46:16Z Mjh 2 wikitext text/x-wiki If possible, please say explicitly that you have used the STRI cluster in any paper you publish that makes use of results obtained using it. We don't insist on any explicit form of words, but an example might be 'This work has made use of the University of Hertfordshire's high-performance computing facility.' If you wish you can add a link to <tt>http://stri-cluster.herts.ac.uk/</tt>. Please also add details of any submitted, accepted or published paper using the cluster to the [[Cluster bibliography]] page. 036e3a59a2a0a019b51851d4c8c04e70ee465178 Architecture 0 7 459 398 2016-12-06T17:12:10Z Mjh 2 wikitext text/x-wiki The cluster consists of * a head node, which is an 2 socket x 4-core x 2 Hyperthreads Xeon-based machine with 32 GB RAM for user logins and development * 140 compute nodes (or just 'nodes'), as follows: ** 48 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 24 GB RAM and DDR Infiniband form part of the Main cluster (chassis 1, 2 and 3) ** 32 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 12 GB RAM and QDR Infiniband form part of the CAIR cluster (chassis 4 and 5) ** 12 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form the rest of the CAIR cluster (chassis 6) ** 16 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form part of the CAR cluster (chassis 7) ** 16 Xeons (X5675s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form the rest of the CAR cluster (chassis 8) ** 12 Xeons (E5-2660s) 2 socket x 8-core with 64 GB RAM and FDR Infiniband form the rest of the main cluster (chassis 9) ** 4 Xeons (E5-2660s) 2 socket x 8-core with 16 GB RAM and FDR Infiniband are dedicated AQF (CAIR) nodes (chassis 9) * Three [[SMP machines]], two with 48 cores (4 sockets x 12 cores, Opteron 6174) and one with 32 (4 sockets x 8 cores, Xeon 4620), all with 256 GB RAM and QDR Infiniband (smp1-3). * Two SMP blades in chassis6 with 32 cores (2 sockets x 16 cores) and 256 GB RAM ** 100 TB of [[storage]] attached to the SMP machines * A file server, stri-server, which is a 2 socket x 8 core Xeon machine ** 473 TB of [[storage]] attached via Fibre Channel to this server. * A cair data processing server - [[cair-cluster]], which is a 2 socket x 8 core Xeon-based machine with 32 GB RAM ('''CAIR use only'''). ** 132 TB of [[storage]] attached via Fibre Channel to this server. * A [[Tesla|Tesla S2050]] CUDA unit attached to smp1. 
* A separate server -- [[cair-forecast]] -- providing an additional 80 TB of storage and processing for CAIR AQF use. * A separate server -- [[lofar-server]] -- providing 100 TB of storage for [[LOFAR-UK]] users * Ethernet and infiniband switches to provide connectivity. The compute nodes are Dell blades installed in enclosures (chassis) of 16 blades each: there are 9 chassis in total. Nodes within a given chassis communicate via switches internal to the chassis; external switches connect the chassis together. See [[networking]] for more details. The image below shows the physical layout of the cluster components as at late 2011. The cluster is no longer in this configuration. http://stri-cluster.herts.ac.uk/cluster2.jpg 6709bcad63cd88bdd45d476c574fcf1e789baa0b 462 459 2016-12-06T17:16:57Z Mjh 2 wikitext text/x-wiki The cluster consists of * a head node, which is an 2 socket x 4-core x 2 Hyperthreads Xeon-based machine with 32 GB RAM for user logins and development * 140 compute nodes (or just 'nodes'), as follows: ** 48 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 24 GB RAM and DDR Infiniband form part of the Main cluster (chassis 1, 2 and 3) ** 32 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 12 GB RAM and QDR Infiniband form part of the CAIR cluster (chassis 4 and 5) ** 12 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form the rest of the CAIR cluster (chassis 6) ** 16 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form part of the CAR cluster (chassis 7) ** 16 Xeons (X5675s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form the rest of the CAR cluster (chassis 8) ** 12 Xeons (E5-2660s) 2 socket x 8-core with 64 GB RAM and FDR Infiniband form the rest of the main cluster (chassis 9) ** 4 Xeons (E5-2660s) 2 socket x 8-core with 16 GB RAM and FDR Infiniband are dedicated AQF (CAIR) nodes (chassis 9) * Three [[SMP machines]], two with 48 cores (4 sockets x 12 cores, Opteron 6174) and one with 32 (4 sockets x 8 cores, Xeon 4620), all with 256 GB RAM and QDR Infiniband (smp1-3). * Two SMP blades in chassis6 with 32 cores (2 sockets x 16 cores) and 256 GB RAM ** 100 TB of [[storage]] attached to the SMP machines * A file server, stri-server, which is a 2 socket x 8 core Xeon machine ** 473 TB of [[storage]] attached via Fibre Channel to this server. * A cair data processing server - [[cair-cluster]], which is a 2 socket x 8 core Xeon-based machine with 32 GB RAM ('''CAIR use only'''). ** 132 TB of [[storage]] attached via Fibre Channel to this server. * A [[Tesla|Tesla S2050]] CUDA unit attached to smp1. * A GPU machine, gpu1 * A [[Tesla|Tesla K80]] unit attached to gpu1 * A separate server -- [[cair-forecast]] -- providing an additional 80 TB of storage and processing for CAIR AQF use. * A separate server -- [[lofar-server]] -- providing 100 TB of storage for [[LOFAR-UK Compute Facility|LOFAR-UK]] users * Ethernet and infiniband switches to provide connectivity. The compute nodes are Dell blades installed in enclosures (chassis) of 16 blades each: there are 9 chassis in total. Nodes within a given chassis communicate via switches internal to the chassis; external switches connect the chassis together. See [[networking]] for more details. The image below shows the physical layout of the cluster components as at late 2011. The cluster is no longer in this configuration. 
http://stri-cluster.herts.ac.uk/cluster2.jpg d4b0a9a43819f22ba9582c5eacec4ccc8adff476 SMP machines 0 24 461 343 2016-12-06T17:14:01Z Mjh 2 wikitext text/x-wiki The SMP machines are: * smp1, smp2: two 4-processor, 48-core systems each with 256 GB of RAM, available for general use. The individual cores in these machines are slightly slower (~ 20% on CPU-bound floating-point operations) than the Intel CPUs of the main computing nodes. So users are recommended to use the main part of the cluster for straightforward computing requirements that do not need the special features of the SMP machines. * smp3: one 4-processor, 32-core system with 2.2-GHz E5-4620 Intel CPUs and 256 GB RAM available to CAR users only. * node095, node096: two 2-processor, 16-core systems with 256 GB RAM. The big advantage of the SMP machines is the large amount of physical memory visible to all cores. This allows for multi-threaded, shared-memory applications. The SMP machines smp1-3 each have a large amount of local scratch space (10 TB for smp1/2, 100 TB for smp3) which is mounted as /scratch on the SMP machines and visible as /smp1, /smp2 and /smp3 on the head node. smp3 is intended for data reduction for CAR users only. node095 and node096 have no local scratch. Jobs may be started on the SMP machines using the <tt>smp</tt> [[queues|queue]] in Torque. f49c996bd16c5c3532de1d52d08031149e21ab1b Networking 0 10 463 313 2016-12-06T17:17:35Z Mjh 2 wikitext text/x-wiki The nodes are linked by Gigabit ethernet and Infiniband networks. The ethernet network is a fairly conventional one; each chassis (see [[architecture]]) has an internal ethernet switch to which all nodes in that chassis are connected. These switches, together with the head node, are connected via a main ethernet switch. The links connecting the head node and the SMP machines, and the uplinks from the two node internal ethernet stacks, are carried by multiple physical cables via link aggregation. A separate physical ethernet network carries management traffic. The infiniband network is slightly more complex. Each chassis has an internal infiniband switch and these are all linked via two main infiniband switches. This arrangement is intended to provide redundancy and higher bandwidth between nodes in different chassis. chassis1-3 use DDR infiniband; all other machines on the network have QDR infiniband cards. The networking arrangements should be taken into consideration when running jobs where fast communication is important -- latency is somewhat lower and data transfer rates somewhat higher between nodes in the same chassis than between nodes in different chassis in the same cluster, and ethernet connections between the two sub-clusters are higher-latency and lower-bandwidth still. Best results will be obtained running jobs within a single chassis. None of the nodes are on the public Internet: they all have private IP addresses starting 192.168.... These addresses are not routable from anywhere else in the University. They can, however, all see the public Internet via IP masquerading through the head node over the ethernet network. (For this reason, it is not sensible to try to make high-volume accesses to the Internet from the nodes.) Each node has an IP address corresponding to its ethernet port (192.168.2.xxx where xxx is the node number) and one corresponding to the Infiniband port (192.168.3.xxx). Within the cluster, you may use nodexxx.data (e.g. node001.data) and nodexxx.infi (node001.infi) to refer to these two networks.
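As an illustration, a bulk copy between two nodes can be directed over the Infiniband network simply by addressing the target with its .infi name (a hypothetical example; the paths and node number are made up):
<pre>
# copy a scratch directory from the current node to node002 over Infiniband
rsync -av /tmp/myrun/ node002.infi:/tmp/myrun/
</pre>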
High-volume traffic should use the Infiniband network. The SMP machines have addresses smp1.data, smp1.infi etc. 8fbb6266bcdaa3b1fa6a4f1b6bc994b68c26ac4d Administrators 0 6 464 404 2017-03-03T13:15:53Z Asinha 12 wikitext text/x-wiki == Administrators == These are currently: * Leigh Smith, l.c.smith@herts.ac.uk (x3358, room E117C) * Martin Hardcastle, m.j.hardcastle@herts.ac.uk (x3409, room 2E71 (Innovation Centre)). Contact us with queries. Basic support queries (e.g. account requests, difficulty logging on or using software) should be directed to Leigh in the first instance. 8d0fa447d05f0a03683f52d82dd0ce440b3dc93c User:H.patel 2 68 465 2017-03-22T09:35:47Z Mjh 2 Creating user page for new user. wikitext text/x-wiki Hershna Patel BSc Research student (computational biochemistry and bioinformatics) Department of Biological & Environmental Science School of Life and Medical Sciences University of Hertfordshire Hatfield AL10 9AB United Kingdom Publications Patel, H., & Kukol, A. (2016). Evaluation of a novel virtual screening strategy using receptor decoy binding sites. Journal of Negative Results in Biomedicine, 15(1), 15. Patel, H., & Kukol, A. (2016). Recent discoveries of influenza A drug target sites to combat virus replication. Biochemical Society Transactions, 44(3), 932-36. Kukol, A. & Patel, H. 2014. Influenza A nucleoprotein binding sites for antivirals: current research and future potential. Future Virology, 9(7), 625-27. a1943d3fca274a6170fd257a983e73cced96216b User talk:H.patel 3 69 466 2017-03-22T09:35:49Z Mjh 2 Welcome! wikitext text/x-wiki '''Welcome to ''Clusterwiki''!''' We hope you will contribute much and well. You will probably want to read the [https://www.mediawiki.org/wiki/Special:MyLanguage/Help:Contents help pages]. Again, welcome and have fun! [[User:Mjh|Mjh]] ([[User talk:Mjh|talk]]) 09:35, 22 March 2017 (UTC) ef38c68553f26ea0050459806776ce07e849dca1 Cluster bibliography 0 30 467 384 2017-03-22T13:52:32Z H.patel 14 wikitext text/x-wiki Please add details of any papers written using the cluster here. This allows us to keep track of the productivity of the cluster. You can use any format you like so long as you identify the authors, the title (this helps us keep track of what different areas of research are being carried out with the cluster), and an indication of the bibliographic details and date. * Patel, H., & Kukol, A. (2016). Evaluation of a novel virtual screening strategy using receptor decoy binding sites. Journal of Negative Results in BioMedicine, 15(15). * Kukol A, Hughes DJ, Large-scale analysis of Influenza A virus nucleoprotein sequence conservation reveals potential drug-target sites, '''2014''', ''Virology'', 454/55: 40-47. * Poojari, C, Kukol, A, Strodel, B, How the amyloid-β peptide and membranes affect each other: An extensive simulation study, '''2013''', ''Biochimica et Biophysica Acta - Biomembranes'', 1828(2), 327-339. * Steuernagel O., Kakofengitis D., Ritter G., Wigner flow reveals topological order in quantum phase space dynamics, '''2012''', accepted by ''PRL''. * Hardcastle MJ, Krause MGH, Numerical modelling of the lobes of radio galaxies in cluster environments, '''2012''', submitted to ''MNRAS'' * Kalia M, Kukol A, Structure and dynamics of the kinase IKK-β – a key regulator of the NF-kappa B transcription factor, ''J Struct Biol'', '''2011''', 176(2), 133-142. * Kukol A, Consensus virtual screening approaches to predict protein ligands, ''Eur J Med Chem'', '''2011''', 46(9), 4661-4664. 
* Hardcastle MJ, Croston JH, Modelling TeV gamma-ray emission from the kiloparsec-scale jets of Centaurus A and M87, '''2011''', ''MNRAS'' 415 433 * Goodger JL, Croston JH, Hardcastle MJ, The influence of radio-loud AGN on groups of galaxies: Paper I - the X-ray luminosity-temperature relationship, submitted to ''MNRAS'', May 2011. * de Sousa G, Maex R, Adams R, Davey N, Steuber V, Optimization of neuronal morphologies for pattern recognition, published at ''BMC Neuroscience'' '''2010''' * Karen Safaryan, Reinoud Maex, Rod Adams, Neil Davey, and Volker Steuber The Effect of Non-Specific LTD on Pattern Recognition in Cerebellar Purkinje Cells, published at ''BMC Neuroscience'' '''11''', P92 1cf18b1e168bc842d08e23320d68400417545e99 Gromacs 0 19 469 375 2017-03-23T10:17:36Z H.patel 14 wikitext text/x-wiki [http://www.gromacs.org Gromacs] is a software for molecular dynamics simulation. It treats molecules as particles in a classical mechanics forcefield. The current version is built with GPU support which offers a significant speed up of simulation time. == '''How to perform a simulation with Gromacs' mdrun:''' == 1) You need to prepare the binary simulation start file (tpr-file) either on your local machine or on the headnode of the cluster. If you prepare it on the local machine, make sure you use Gromacs version 2016. 2) Prepare a c-shell script (e.g. runjob.sh) as shown in the example below. [[Jobs|More info.]] 3) Make the script executable: chmod +x runjob.sh 4) Submit the job to the cluster: qsub runjob.sh The script below needs to be adjusted according to your needs, i.e. the user name (#PBS -u), the maximum time a job should run (#PBS -l walltime=<hours>:<min>:<sec>), the working directory and the details of the mdrun command. Look here for [[groperform|optimising performance]]. ''Andreas/Hershna'' -------------- <pre>#!/bin/sh #PBS -N GromacsTest #PBS -q gpu #PBS -l nodes=1:ppn=16 #PBS -l walltime=48:00:00 #PBS -j oe #PBS -k oe #PBS -u hpatel # runs a job with name 'GromacsTest' on the gpu machine on the cluster # uses 1 GPU # set a maximum time of forty eight hours (walltime) # merge 'output' and 'standard error' and output both to 'standard output' (-j oe) # produced the output, while the job is running (-k oe) # specifies user 'hpatel' # set required paths: source /soft/gromacs-2016-gpu/bin/GMXRC # used previously: export LD_LIBRARY_PATH='/usr/lib64/mpich2/lib:$LD_LIBRARY_PATH' export LD_LIBRARY_PATH="soft/mpi/mvapich2-1.6/lib:${LD_LIBRARY_PATH}" # specify working directory: cd /home/hpatel/gromacsGPU ### This is the command ### gmx mdrun -s run.tpr -c after_md.gro -v -stepout 1000 ### command end ### # start with 'qsub runjob.sh' </pre> [http://www.phy.bme.hu/~cluster/docs/PBS.html A good tutorial for using PBS] 3820d36bd0638aad68ee592dc32dfc12b8d272b2 470 469 2017-03-23T10:20:21Z H.patel 14 wikitext text/x-wiki [http://www.gromacs.org Gromacs] is a software for molecular dynamics simulation. It treats molecules as particles in a classical mechanics forcefield. == '''How to perform a simulation with Gromacs' mdrun:''' == 1) You need to prepare the binary simulation start file (tpr-file) either on your local machine or on the headnode of the cluster. If you prepare it on the local machine, make sure you use Gromacs version 2016. 2) Prepare a c-shell script (e.g. runjob.sh) as shown in the example below. 
[[Jobs|More info.]] 3) Make the script executable: chmod +x runjob.sh 4) Submit the job to the cluster: qsub runjob.sh The script below needs to be adjusted according to your needs, i.e. the user name (#PBS -u), the maximum time a job should run (#PBS -l walltime=<hours>:<min>:<sec>), the working directory and the details of the mdrun command. Look here for [[groperform|optimising performance]]. The current version is built with GPU support which offers a significant speed up of simulation time. ''Andreas/Hershna'' -------------- <pre>#!/bin/sh #PBS -N GromacsTest #PBS -q gpu #PBS -l nodes=1:ppn=16 #PBS -l walltime=48:00:00 #PBS -j oe #PBS -k oe #PBS -u hpatel # runs a job with name 'GromacsTest' on the gpu machine on the cluster # uses 1 GPU # set a maximum time of forty eight hours (walltime) # merge 'output' and 'standard error' and output both to 'standard output' (-j oe) # produced the output, while the job is running (-k oe) # specifies user 'hpatel' # set required paths: source /soft/gromacs-2016-gpu/bin/GMXRC # used previously: export LD_LIBRARY_PATH='/usr/lib64/mpich2/lib:$LD_LIBRARY_PATH' export LD_LIBRARY_PATH="soft/mpi/mvapich2-1.6/lib:${LD_LIBRARY_PATH}" # specify working directory: cd /home/hpatel/gromacsGPU ### This is the command ### gmx mdrun -s run.tpr -c after_md.gro -v -stepout 1000 ### command end ### # start with 'qsub runjob.sh' </pre> [http://www.phy.bme.hu/~cluster/docs/PBS.html A good tutorial for using PBS] c995ee5ec6bc44157a729a41c4ac8f1d1765baaa Vina 0 23 472 372 2017-04-13T11:15:26Z H.patel 14 wikitext text/x-wiki [http://vina.scripps.edu/ AutoDock Vina] is software for molecular docking. It is very easy to use for virtual screening with shell scripts as explained in the manual. The molecules need to be in pdbqt format. This format can be easily generated with Raccoon (see [[Autodock]]). Copy the pdbqt files, the protein-receptor and the vina configuration file (conf.txt in the example below) into the working directory. The following bash-script carries out the virtual screening. Start with 'nohup vinaScreen.bash > nohup.out &' (not qsub). ''Andreas'' <pre>#!/bin/bash # start from the folder where the .pdbqt files are located mypath=`pwd` for f in ZINC*.pdbqt; do b="dock_$f" echo Processing ligand $b mkdir -p $b echo "#!/bin/bash" > job.sh echo "cd $mypath" >> job.sh echo /soft/autodock_vina_1_1_1_linux_x86/bin/vina --config $mypath/conf.txt --ligand $mypath/$f --out $mypath/${b}/out.pdbqt --log $mypath/${b}/log.txt >> job.sh qsub -N $f -j oe -l cput=00:12:00 -l nodes=1:ppn=8 -l walltime=00:12:00 -l mem=512mb job.sh sleep 30 # reduce this waiting time in seconds to speed up done </pre> '''If screening over 500 molecules, the following python script must be used.''' It requires a file called ‘filelist’ (a list of all molecules to be docked) in the working directory. The script needs to be run twice; first to generate the scripts to run the screening, each script will contain 100 dockings to submit as 1 job. The second time; lines 12 and 13 should be deleted, and line 15 (with the qsub in it) uncommented. Running the script again will submit the jobs. 
''Hershna'' import os import time from subprocess import Popen,PIPE step=100 mypath=os.getcwd() files=open('filelist').read().splitlines() i=0 while i<len(files): c=0 # test -- write the job scripts to a file but don't run them q=Popen('cat > test'+str(i),shell=True,stdin=PIPE).stdin # actual run # q=Popen('qsub -N dock-'+str(i)+' -j oe -o /dev/null -l nodes=1:ppn=8 -l walltime=33:00:00',shell=True,stdin=PIPE).stdin q.write('#!/bin/bash\n') q.write('cd '+mypath+'\n') while c<100 and i<len(files): f=files[i] b='dock_'+f q.write('mkdir -p '+b+'\n') q.write('/soft/autodock_vina_1_1_1_linux_x86/bin/vina --config '+mypath+'/conf.txt --ligand '+mypath+'/'+f+' --out '+mypath+'/'+b+'/out.pdbqt --log '+mypath+'/'+b+'/log.txt\n') i+=1 c+=1 q.close() time.sleep(2) 978f6abfe2581390fe2aeb6778cb413eabb9b542 473 472 2017-04-13T11:19:40Z H.patel 14 wikitext text/x-wiki [http://vina.scripps.edu/ AutoDock Vina] is software for molecular docking. It is very easy to use for virtual screening with shell scripts as explained in the manual. The molecules need to be in pdbqt format. This format can be easily generated with Raccoon (see [[Autodock]]). Copy the pdbqt files, the protein-receptor and the vina configuration file (conf.txt in the example below) into the working directory. The following bash-script carries out the virtual screening. Start with 'nohup vinaScreen.bash > nohup.out &' (not qsub). ''Andreas'' <pre>#!/bin/bash # start from the folder where the .pdbqt files are located mypath=`pwd` for f in ZINC*.pdbqt; do b="dock_$f" echo Processing ligand $b mkdir -p $b echo "#!/bin/bash" > job.sh echo "cd $mypath" >> job.sh echo /soft/autodock_vina_1_1_1_linux_x86/bin/vina --config $mypath/conf.txt --ligand $mypath/$f --out $mypath/${b}/out.pdbqt --log $mypath/${b}/log.txt >> job.sh qsub -N $f -j oe -l cput=00:12:00 -l nodes=1:ppn=8 -l walltime=00:12:00 -l mem=512mb job.sh sleep 30 # reduce this waiting time in seconds to speed up done </pre> '''If screening over 500 molecules, the following python script must be used.''' It requires a file called ‘filelist’ (a list of all molecules to be docked) in the working directory. The script needs to be run twice; first to generate the scripts to run the screening, each script will contain 100 dockings to submit as 1 job. The second time; lines 12 and 13 should be deleted, and line 15 (with the qsub in it) uncommented. Running the script again will submit the jobs. Start with 'python run-jobs.py' ''Hershna'' import os import time from subprocess import Popen,PIPE step=100 mypath=os.getcwd() files=open('filelist').read().splitlines() i=0 while i<len(files): c=0 # test -- write the job scripts to a file but don't run them q=Popen('cat > test'+str(i),shell=True,stdin=PIPE).stdin # actual run # q=Popen('qsub -N dock-'+str(i)+' -j oe -o /dev/null -l nodes=1:ppn=8 -l walltime=33:00:00',shell=True,stdin=PIPE).stdin q.write('#!/bin/bash\n') q.write('cd '+mypath+'\n') while c<100 and i<len(files): f=files[i] b='dock_'+f q.write('mkdir -p '+b+'\n') q.write('/soft/autodock_vina_1_1_1_linux_x86/bin/vina --config '+mypath+'/conf.txt --ligand '+mypath+'/'+f+' --out '+mypath+'/'+b+'/out.pdbqt --log '+mypath+'/'+b+'/log.txt\n') i+=1 c+=1 q.close() time.sleep(2) 040b5138e3107707be78eaa32287d7c4445bc91a Vina 0 23 474 473 2017-04-13T11:28:01Z H.patel 14 wikitext text/x-wiki [http://vina.scripps.edu/ AutoDock Vina] is software for molecular docking. It is very easy to use for virtual screening with shell scripts as explained in the manual. The molecules need to be in pdbqt format. 
This format can be easily generated with Raccoon (see [[Autodock]]). Copy the pdbqt files, the protein-receptor and the vina configuration file (conf.txt in the example below) into the working directory. The following bash-script carries out the virtual screening. Start with 'nohup vinaScreen.bash > nohup.out &' (not qsub). ''Andreas'' <pre>#!/bin/bash # start from the folder where the .pdbqt files are located mypath=`pwd` for f in ZINC*.pdbqt; do b="dock_$f" echo Processing ligand $b mkdir -p $b echo "#!/bin/bash" > job.sh echo "cd $mypath" >> job.sh echo /soft/autodock_vina_1_1_1_linux_x86/bin/vina --config $mypath/conf.txt --ligand $mypath/$f --out $mypath/${b}/out.pdbqt --log $mypath/${b}/log.txt >> job.sh qsub -N $f -j oe -l cput=00:12:00 -l nodes=1:ppn=8 -l walltime=00:12:00 -l mem=512mb job.sh sleep 30 # reduce this waiting time in seconds to speed up done </pre> '''If screening over 500 molecules, the following python script must be used.''' It requires a file called ‘filelist’ (a list of all molecules to be docked) in the working directory. The script needs to be run twice; first to generate the scripts to run the screening, each script will contain 100 dockings to submit as 1 job. The second time; lines 12 and 13 should be deleted, and line 15 (with the qsub in it) uncommented. Running the script again will submit the jobs. Start with 'python run-jobs.py' ''Hershna'' <pre> import os import time from subprocess import Popen,PIPE step=100 mypath=os.getcwd() files=open('filelist').read().splitlines() i=0 while i<len(files): c=0 # test -- write the job scripts to a file but don't run them q=Popen('cat > test'+str(i),shell=True,stdin=PIPE).stdin # actual run # q=Popen('qsub -N dock-'+str(i)+' -j oe -o /dev/null -l nodes=1:ppn=8 -l walltime=33:00:00',shell=True,stdin=PIPE).stdin q.write('#!/bin/bash\n') q.write('cd '+mypath+'\n') while c<100 and i<len(files): f=files[i] b='dock_'+f q.write('mkdir -p '+b+'\n') q.write('/soft/autodock_vina_1_1_1_linux_x86/bin/vina --config '+mypath+'/conf.txt --ligand '+mypath+'/'+f+' --out '+mypath+'/'+b+'/out.pdbqt --log '+mypath+'/'+b+'/log.txt\n') i+=1 c+=1 q.close() time.sleep(2) </pre> 94309f3b8aa3a6fa6667c297a728d2f1cf2de530 Autodock 0 22 475 373 2017-04-13T11:39:48Z H.patel 14 wikitext text/x-wiki [http://autodock.scripps.edu/ AutoDock] is software for molecular docking. It is best used together with the graphical user interface [http://mgltools.scripps.edu/downloads AutoDock Tools]. AutoDock Tools can be installed in your home directory e.g. 'akukol/bin/MGLtools'. You need to download the file 'mgltools_x86_64Linux2_1.5.4.tar.gz'. The automatic installer does not work. For virtual screening obtain the software [http://autodock.scripps.edu/resources/raccoon Raccoon]. This needs AutoDock tools to be installed first. Raccoon automatically generates the file vs_submit.sh. This must be edited according to your special circumstances (see below) and then started with: 'nohup vs_submit.sh &' (do not use qsub) ''Andreas'' <pre>#!/bin/bash # # Generated with Raccoon | AutoDockVS # #### PBS jobs parametersCPUT="00:20:00" WALLT="00:20:00" # << change here # # There should be no reason # for changing the following values NODES=1 PPN=1 MEM=512mb ### CUSTOM VARIABLES # # use the following line to set special options (e.g. 
specific queues) #OPT="-q MyPriorQueue" OPT="-j oe -N AutoDock" # join output and error, job name: Autodock # Paths for executables on the cluster # Modify them to specify custom executables to be used QSUB="qsub" # << change here AUTODOCK="/soft/autodock/autodock4" # << change here # Special path to move into before running # the screening. This is very system-specific, # so unless you're know what are you doing, # leave it as it is WORKING_PATH=`pwd` ################################################################################## ################################################################################## ####### There should be no need to modify anything below this line ############################### ################################################################################## ################################################################################## # # type $AUTODOCK &> /dev/null || { echo -e "\nError: the file [$AUTODOCK] doesn't exist or is not executable\n"; echo -e "Try to specify the full path to the executable of the AutoDock binary in the script"; echo -e "( i.e. AUTODOCK=/usr/bin/autodock4 )\n\n"; echo -e " [ virtuals screening submission aborted]\n" exit 1; } type $QSUB &> /dev/null || { echo -e "\nError: the file [$QSUB] doesn't exist or is not executable\n"; echo -e "Try to specify the full path to the executable of the Qsub command binary in the script"; echo -e "( i.e. QSUB=/usr/bin/qsub )\n\n"; echo -e " [ virtuals screening submission aborted]\n" exit 1; } echo Starting submission... for NAME in `cat jobs_list` do cd $NAME echo "#!/bin/bash" > $NAME.job echo "cd $WORKING_PATH/$NAME" >> $NAME.job echo "$AUTODOCK -p $NAME.dpf -l $NAME.dlg" >> $NAME.job chmod +x $NAME.job echo -n "Submitting $NAME : " $QSUB $OPT -l cput=$CPUT -l nodes=1:ppn=1 -l walltime=$WALLT -l mem=$MEM $NAME.job sleep 23 # << add this line to avoid flooding the cluster with 1000nds of jobs cd .. done </pre> The wait time of 23 seconds may be reduced in order to speed up the calculation. '''If screening over 500 molecules, the following python script must be used.''' It requires a file called ‘JobsList’ (a list of all molecules to be docked) in the working directory. This will be automatically generated by Raccoon. The script needs to be run twice; first to generate the scripts to run the screening, each script will contain 100 dockings to submit as 1 job. The second time; lines 12 and 13 should be deleted, and line 15 (with the qsub in it) uncommented. Running the script again will submit the jobs. Start with 'python run-jobsAD4.py' ''Hershna'' <pre>import os import time from subprocess import Popen,PIPE step=100 mypath=os.getcwd() files=open('JobsList').read().splitlines() i=0 while i<len(files): c=0 # test -- write the job scripts to a file but don't run them q=Popen('cat > test'+str(i),shell=True,stdin=PIPE).stdin # actual run # q=Popen('qsub -N AD4-'+str(i)+' -j oe -o /dev/null -l nodes=1:ppn=1 -l walltime=33:00:00',shell=True,stdin=PIPE).stdin q.write('#!/bin/bash\n') while c<100 and i<len(files): f=files[i] q.write('cd '+mypath+'/'+f+'\n') q.write('/soft/autodock/autodock4 -p '+f+'.dpf -l '+f+'.dlg \n') i+=1 c+=1 q.close() time.sleep(2) </pre> 48ca91084a0d0224a0390f0dbc1d34112071c9d0 Cluster bibliography 0 30 476 467 2017-06-29T09:02:04Z Akukol 3 wikitext text/x-wiki Please add details of any papers written using the cluster here. This allows us to keep track of the productivity of the cluster. 
You can use any format you like so long as you identify the authors, the title (this helps us keep track of what different areas of research are being carried out with the cluster), and an indication of the bibliographic details and date. * Patel, H., & Kukol, A. ('''2017'''). Evolutionary conservation of influenza A PB2 sequences reveals potential target sites for small molecule inhibitors. ''Virology'', 509, 112-120. DOI: 10.1016/j.virol.2017.06.009 * Patel, H., & Kukol, A. ('''2016'''). Evaluation of a novel virtual screening strategy using receptor decoy binding sites. ''J Negative Results in BioMedicine'', 15(15). * Kukol A, Hughes DJ ('''2014'''), Large-scale analysis of Influenza A virus nucleoprotein sequence conservation reveals potential drug-target sites, ''Virology'', 454/55: 40-47. * Poojari, C, Kukol, A, Strodel, B ('''2013'''). How the amyloid-β peptide and membranes affect each other: An extensive simulation study, ''Biochimica et Biophysica Acta - Biomembranes'', 1828(2), 327-339. * Steuernagel O., Kakofengitis D., Ritter G., Wigner flow reveals topological order in quantum phase space dynamics, '''2012''', accepted by ''PRL''. * Hardcastle MJ, Krause MGH, Numerical modelling of the lobes of radio galaxies in cluster environments, '''2012''', submitted to ''MNRAS'' * Kalia M, Kukol A ('''2011''') Structure and dynamics of the kinase IKK-β – a key regulator of the NF-kappa B transcription factor, ''J Struct Biol'', 176(2), 133-142. * Kukol A, Consensus virtual screening approaches to predict protein ligands ('''2011'''), ''Eur J Med Chem'', 46(9), 4661-4664. * Hardcastle MJ, Croston JH, Modelling TeV gamma-ray emission from the kiloparsec-scale jets of Centaurus A and M87, '''2011''', ''MNRAS'' 415 433 * Goodger JL, Croston JH, Hardcastle MJ, The influence of radio-loud AGN on groups of galaxies: Paper I - the X-ray luminosity-temperature relationship, submitted to ''MNRAS'', May 2011. * de Sousa G, Maex R, Adams R, Davey N, Steuber V, Optimization of neuronal morphologies for pattern recognition, published at ''BMC Neuroscience'' '''2010''' * Karen Safaryan, Reinoud Maex, Rod Adams, Neil Davey, and Volker Steuber The Effect of Non-Specific LTD on Pattern Recognition in Cerebellar Purkinje Cells, published at ''BMC Neuroscience'' '''11''', P92 b1084dd6d4f09af3eb8893022d519f4138b8b871 Gromacs 0 19 477 470 2017-09-28T08:55:37Z Akukol 3 wikitext text/x-wiki [http://www.gromacs.org Gromacs] is a software for molecular dynamics simulation. It treats molecules as particles in a classical mechanics forcefield. == '''How to perform a simulation with Gromacs' mdrun:''' == 1) You need to prepare the binary simulation start file (tpr-file) either on your local machine or on the headnode of the cluster. If you prepare it on the local machine, make sure you use Gromacs version 2016. 2) Prepare a c-shell script (e.g. runjob.sh) as shown in the example below. [[Jobs|More info.]] 3) Make the script executable: chmod +x runjob.sh 4) Submit the job to the cluster: qsub runjob.sh The script below needs to be adjusted according to your needs, i.e. the user name (#PBS -u), the maximum time a job should run (#PBS -l walltime=<hours>:<min>:<sec>), the working directory and the details of the mdrun command. Look here for [[groperform|optimising performance]]. The current version is built with GPU support which offers a significant speed up of simulation time. Since Gromacs 2016.4 there is no need to distinguish between the GPU and non-GPU ('mpi') version. 
Note that all GPUs attached to the node are used automatically. The maximum walltime is 48 hours. ''Andreas/Hershna'' -------------- <pre>#!/bin/sh #PBS -N GromacsTest #PBS -q gpu #PBS -l nodes=1:ppn=16 #PBS -l walltime=48:00:00 #PBS -j oe #PBS -k oe #PBS -u hpatel # runs a job with name 'GromacsTest' on the gpu machine on the cluster # uses the GPUs attached to the node # set a maximum time of forty-eight hours (walltime) # merge 'output' and 'standard error' and output both to 'standard output' (-j oe) # produces output while the job is running (-k oe) # specifies user 'hpatel' # set required paths: source /soft/gromacs-2016.4/bin/GMXRC export LD_LIBRARY_PATH="/soft/mpi/mvapich2-1.6/lib:${LD_LIBRARY_PATH}" # specify working directory: cd /home/hpatel/gromacsGPU ### This is the command ### gmx mdrun -s run.tpr -c after_md.gro -v -stepout 1000 ### command end ### # start with 'qsub runjob.sh' </pre> [http://www.phy.bme.hu/~cluster/docs/PBS.html A good tutorial for using PBS] aa2a8e15505e651c9342ca272aae5f620b95ff68 Accounts 0 3 478 453 2017-10-16T14:30:57Z Mjh 2 wikitext text/x-wiki To get an account, speak to Martin Hardcastle in 2E71 (Innovation Centre). Accounts are available to the following classes of people: * Members of the Centre for Astrophysics Research (CAR) * Members of the Centre for Atmospheric & Instrumentation Research (CAIR) * Other research-active members of the School of Physics, Astronomy and Mathematics (PAM) * Members of the School of Computer Science (CS) * Others, by special arrangement; restricted to those who have made a financial contribution to the cluster. Access is granted subject to observance of our usage [[policies]]. Accounts may be suspended or cancelled if the cluster is abused: see also our [[Account cancellation policy]]. bc61e4cb28663730cc21335e535e8e4ff0184498 479 478 2017-10-16T14:32:43Z Mjh 2 wikitext text/x-wiki To get an account, speak to Martin Hardcastle in 2E71 (Innovation Centre). Accounts are available to members of the Schools of PAM, Engineering and Computer Science, and to others by special arrangement. Access is granted subject to observance of our usage [[policies]]. Accounts may be suspended or cancelled if the cluster is abused: see also our [[Account cancellation policy]].
1e433a7cf44548c1cb0a2992b929ab627bd3d48d Architecture 0 7 480 462 2017-11-03T09:27:04Z Mjh 2 wikitext text/x-wiki The cluster consists of * a head node, which is an 2 socket x 4-core x 2 Hyperthreads Xeon-based machine with 32 GB RAM for user logins and development * compute nodes (or just 'nodes'), as follows: ** 52 Xeons (Gold 6130) 2 socket x 16 core (no hyperthreading) with 192 GB RAM and FDR Infiniband form part of the Main cluster (node001-052) ** 16 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 12 GB RAM and QDR Infiniband form part of the CAIR cluster (node064-79: chassis 5) ** 12 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form the rest of the CAIR cluster (node080-92: chassis 6) ** 2 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 24 GB RAM and DDR Infiniband form part of the main cluster (node093-94: chassis 6) ** 16 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form part of the CAR cluster (node097-node111: chassis 7) ** 16 Xeons (X5675s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form the rest of the CAR cluster (node112-127: chassis 8) ** 12 Xeons (E5-2660s) 2 socket x 8-core with 64 GB RAM and FDR Infiniband form the rest of the main cluster (node128-140: chassis 9) ** 4 Xeons (E5-2660s) 2 socket x 8-core with 16 GB RAM and FDR Infiniband are dedicated AQF (CAIR) nodes (node141-144: chassis 9) * Three [[SMP machines]], two with 48 cores (4 sockets x 12 cores, Opteron 6174) and one with 32 (4 sockets x 8 cores, Xeon 4620), all with 256 GB RAM and QDR Infiniband (smp1-3). * Two SMP blades in chassis6 (node095-096) with 32 cores (2 sockets x 16 cores) and 256 GB RAM ** 100 TB of [[storage]] attached to the SMP machines * A file server, stri-server, which is a 2 socket x 8 core Xeon machine ** 473 TB of [[storage]] attached via Fibre Channel to this server. * A cair data processing server - [[cair-cluster]], which is a 2 socket x 8 core Xeon-based machine with 32 GB RAM ('''CAIR use only'''). ** 132 TB of [[storage]] attached via Fibre Channel to this server. * A [[Tesla|Tesla S2050]] CUDA unit attached to smp1. * A GPU machine, gpu1 * A [[Tesla|Tesla K80]] unit attached to gpu1 * A separate server -- [[cair-forecast]] -- providing an additional 80 TB of storage and processing for CAIR AQF use. * A separate server -- [[lofar-server]] -- providing 100 TB of storage for [[LOFAR-UK Compute Facility|LOFAR-UK]] users * Ethernet and infiniband switches to provide connectivity. The compute nodes are Dell blades installed in enclosures (chassis) of 16 blades each: there are 9 chassis in total. Nodes within a given chassis communicate via switches internal to the chassis; external switches connect the chassis together. See [[networking]] for more details. The image below shows the physical layout of the cluster components as at late 2011. The cluster is no longer in this configuration. 
http://stri-cluster.herts.ac.uk/cluster2.jpg dd72cf123d0a0781f5023c0dee635377daf09899 493 480 2018-01-23T13:53:27Z Mjh 2 wikitext text/x-wiki The cluster consists of * a head node, which is an 2 socket x 4-core x 2 Hyperthreads Xeon-based machine with 32 GB RAM for user logins and development * compute nodes (or just 'nodes'), as follows: ** 52 Xeons (Gold 6130) 2 socket x 16 core (no hyperthreading) with 192 GB RAM and FDR Infiniband form part of the Main cluster (node001-052) ** 16 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 12 GB RAM and QDR Infiniband form part of the CAIR cluster (node064-79: chassis 5) ** 12 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form the rest of the CAIR cluster (node080-92: chassis 6) ** 2 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 24 GB RAM and DDR Infiniband form part of the main cluster (node093-94: chassis 6) ** 16 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form part of the CAR cluster (node097-node111: chassis 7) ** 16 Xeons (X5675s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form the rest of the CAR cluster (node112-127: chassis 8) ** 12 Xeons (E5-2660s) 2 socket x 8-core with 64 GB RAM and FDR Infiniband form the rest of the main cluster (node128-140: chassis 9) ** 4 Xeons (E5-2660s) 2 socket x 8-core with 16 GB RAM and FDR Infiniband are dedicated AQF (CAIR) nodes (node141-144: chassis 9) * Three [[SMP machines]], two with 48 cores (4 sockets x 12 cores, Opteron 6174) and one with 32 (4 sockets x 8 cores, Xeon 4620), all with 256 GB RAM and QDR Infiniband (smp1-3). * Two SMP blades in chassis6 (node095-096) with 32 cores (2 sockets x 16 cores) and 256 GB RAM ** 100 TB of [[storage]] attached to the SMP machines * A file server, stri-server, which is a 2 socket x 8 core Xeon machine ** 473 TB of [[storage]] attached via Fibre Channel to this server. * A cair data processing server - [[cair-cluster]], which is a 2 socket x 8 core Xeon-based machine with 32 GB RAM ('''CAIR use only'''). ** 132 TB of [[storage]] attached via Fibre Channel to this server. * A [[Tesla|Tesla S2050]] CUDA unit attached to smp1. * A GPU machine, gpu1 * A [[Tesla|Tesla K80]] unit attached to gpu1 * A separate server -- [[cair-forecast]] -- providing an additional 80 TB of storage and processing for CAIR AQF use. * A separate server -- [[lofar-server]] -- providing 100 TB of storage for [[LOFAR-UK Compute Facility|LOFAR-UK]] users * Ethernet and infiniband switches to provide connectivity. The compute nodes are Dell blades. Nodes within a given chassis communicate via switches internal to the chassis; external switches connect the chassis together. See [[networking]] for more details. The image below shows the physical layout of the cluster components as at late 2011. The cluster is no longer in this configuration. 
http://stri-cluster.herts.ac.uk/cluster2.jpg c6d1eab2ce83ea2894b6a946912c411262c3a73c 497 493 2018-02-03T09:26:29Z Mjh 2 wikitext text/x-wiki The cluster consists of * 2 head nodes: headnode1 and headnode2, each with 192 GB RAM and 2 x 12-core Xeon processors, for user login and development * job/queue server, stri-cluster, with 2 4-core Xeons and 32 GB RAM: old login node, now deprecated * compute nodes (or just 'nodes'), as follows: ** 52 Xeons (Gold 6130) 2 socket x 16 core (no hyperthreading) with 192 GB RAM and FDR Infiniband form part of the Main cluster (node001-052) ** 16 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 12 GB RAM and QDR Infiniband form part of the CAIR cluster (node064-79: chassis 5) ** 12 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form the rest of the CAIR cluster (node080-92: chassis 6) ** 2 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 24 GB RAM and DDR Infiniband form part of the main cluster (node093-94: chassis 6) ** 16 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form part of the CAR cluster (node097-node111: chassis 7) ** 16 Xeons (X5675s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form the rest of the CAR cluster (node112-127: chassis 8) ** 12 Xeons (E5-2660s) 2 socket x 8-core with 64 GB RAM and FDR Infiniband form the rest of the main cluster (node128-140: chassis 9) ** 4 Xeons (E5-2660s) 2 socket x 8-core with 16 GB RAM and FDR Infiniband are dedicated AQF (CAIR) nodes (node141-144: chassis 9) * Three [[SMP machines]], two with 48 cores (4 sockets x 12 cores, Opteron 6174) and one with 32 (4 sockets x 8 cores, Xeon 4620), all with 256 GB RAM and QDR Infiniband (smp1-3). * Two SMP blades in chassis6 (node095-096) with 32 cores (2 sockets x 16 cores) and 256 GB RAM ** 100 TB of [[storage]] attached to the SMP machines * A file server, stri-server, which is a 2 socket x 8 core Xeon machine ** 473 TB of [[storage]] attached via Fibre Channel to this server. * A cair data processing server - [[cair-cluster]], which is a 2 socket x 8 core Xeon-based machine with 32 GB RAM ('''CAIR use only'''). ** 132 TB of [[storage]] attached via Fibre Channel to this server. * A [[Tesla|Tesla S2050]] CUDA unit attached to smp1. * A GPU machine, gpu1 * A [[Tesla|Tesla K80]] unit attached to gpu1 * A separate server -- [[cair-forecast]] -- providing an additional 80 TB of storage and processing for CAIR AQF use. * A separate server -- [[lofar-server]] -- providing 100 TB of storage for [[LOFAR-UK Compute Facility|LOFAR-UK]] users * Ethernet and infiniband switches to provide connectivity. The compute nodes are Dell blades. Nodes within a given chassis communicate via switches internal to the chassis; external switches connect the chassis together. See [[networking]] for more details. The image below shows the physical layout of the cluster components as at late 2011. The cluster is no longer in this configuration. 
http://stri-cluster.herts.ac.uk/cluster2.jpg a932d60addd1f987f6e08d2d4228fba25c011bf7 503 497 2018-03-10T16:29:26Z Mjh 2 wikitext text/x-wiki The cluster consists of * 2 head nodes: headnode1 and headnode2, each with 192 GB RAM and 2 x 12-core Xeon processors, for user login and development * job/queue server, stri-cluster, with 2 4-core Xeons and 32 GB RAM: old login node, now deprecated * compute nodes (or just 'nodes'), as follows: ** 64 Xeons (Gold 6130) 2 socket x 16 core (no hyperthreading) with 192 GB RAM and FDR Infiniband form part of the Main cluster (node001-064) ** 16 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 12 GB RAM and QDR Infiniband form part of the CAIR cluster (node065-80: chassis 5) ** 12 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form the rest of the CAIR cluster (node081-92: chassis 6) ** 2 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 24 GB RAM and DDR Infiniband form part of the main cluster (node093-94: chassis 6) ** 16 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form part of the CAR cluster (node097-node112: chassis 7) ** 16 Xeons (X5675s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form the rest of the CAR cluster (node113-128: chassis 8) ** 12 Xeons (E5-2660s) 2 socket x 8-core with 64 GB RAM and FDR Infiniband form the rest of the main cluster (node129-140: chassis 9) ** 4 Xeons (E5-2660s) 2 socket x 8-core with 16 GB RAM and FDR Infiniband are dedicated AQF (CAIR) nodes (node141-144: chassis 9) * Four [[SMP machines]], two with 48 cores (4 sockets x 12 cores, Opteron 6174) and two with 32 (one 4 sockets x 8 cores, Xeon 4620), all with 256 GB RAM and QDR Infiniband (smp1-4). * Two SMP blades in chassis6 (node095-096) with 32 cores (2 sockets x 16 cores) and 256 GB RAM ** 100 TB of [[storage]] attached to the SMP machines * A file server, stri-server, which is a 2 socket x 8 core Xeon machine ** 473 TB of [[storage]] attached via Fibre Channel to this server. * A cair data processing server - [[cair-cluster]], which is a 2 socket x 8 core Xeon-based machine with 32 GB RAM ('''CAIR use only'''). ** 132 TB of [[storage]] attached via Fibre Channel to this server. * A [[Tesla|Tesla S2050]] CUDA unit attached to smp1. * A GPU machine, gpu1 * A [[Tesla|Tesla K80]] unit attached to gpu1 * A separate server -- [[cair-forecast]] -- providing an additional 80 TB of storage and processing for CAIR AQF use. * A separate server -- [[lofar-server]] -- providing 100 TB of storage for [[LOFAR-UK Compute Facility|LOFAR-UK]] users * Ethernet and infiniband switches to provide connectivity. The compute nodes are Dell blades. Nodes within a given chassis communicate via switches internal to the chassis; external switches connect the chassis together. See [[networking]] for more details. The image below shows the physical layout of the cluster components as at late 2011. The cluster is no longer in this configuration. 
http://stri-cluster.herts.ac.uk/cluster2.jpg 12f8cad95e7a9b583f7daf94133894bb46c1b536 511 503 2018-03-10T18:59:53Z Mjh 2 wikitext text/x-wiki The cluster consists of * 2 head nodes: headnode1 and headnode2, each with 192 GB RAM and 2 x 12-core Xeon processors, for user login and development * job/queue server, stri-cluster, with 2 4-core Xeons and 32 GB RAM: old login node, now deprecated * compute nodes (or just 'nodes'), as follows: ** 64 Xeons (Gold 6130) 2 socket x 16 core (no hyperthreading) with 192 GB RAM and FDR Infiniband form part of the Main cluster (node001-064) ** 16 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 12 GB RAM and QDR Infiniband form part of the CAIR cluster (node065-80: chassis 5) ** 12 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form the rest of the CAIR cluster (node081-92: chassis 6) ** 2 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 24 GB RAM and DDR Infiniband form part of the main cluster (node093-94: chassis 6) ** 16 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form part of the CAR cluster (node097-node112: chassis 7) ** 16 Xeons (X5675s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form the rest of the CAR cluster (node113-128: chassis 8) ** 12 Xeons (E5-2660s) 2 socket x 8-core with 64 GB RAM and FDR Infiniband form the rest of the main cluster (node129-140: chassis 9) ** 4 Xeons (E5-2660s) 2 socket x 8-core with 16 GB RAM and FDR Infiniband are dedicated AQF (CAIR) nodes (node141-144: chassis 9) * Four [[SMP machines]], two with 48 cores (4 sockets x 12 cores, Opteron 6174) and two with 32 (one 4 sockets x 8 cores, Xeon 4620), all with 256 GB RAM and QDR Infiniband (smp1-4). * Two SMP blades in chassis6 (node095-096) with 32 cores (2 sockets x 16 cores) and 256 GB RAM ** 100 TB of [[storage]] attached to the SMP machines * A file server, stri-server, which is a 2 socket x 8 core Xeon machine ** 473 TB of [[storage]] attached via Fibre Channel to this server. * A cair data processing server - [[cair-cluster]], which is a 2 socket x 8 core Xeon-based machine with 32 GB RAM ('''CAIR use only'''). ** 132 TB of [[storage]] attached via Fibre Channel to this server. * A Tesla S2050 [[GPUs|GPU]] unit attached to smp1. * A [[GPUs|GPU] machine, gpu1 * A Tesla K80 [[GPUs|GPU]] unit attached to gpu1 * A separate server -- [[cair-forecast]] -- providing an additional 80 TB of storage and processing for CAIR AQF use. * A separate server -- [[lofar-server]] -- providing 100 TB of storage for [[LOFAR-UK Compute Facility|LOFAR-UK]] users * Ethernet and infiniband switches to provide connectivity. The compute nodes are Dell blades. Nodes within a given chassis communicate via switches internal to the chassis; external switches connect the chassis together. See [[networking]] for more details. The image below shows the physical layout of the cluster components as at late 2011. The cluster is no longer in this configuration. 
http://stri-cluster.herts.ac.uk/cluster2.jpg 653013f42611725dd723c0d918cad2340531fd29 517 511 2018-03-10T21:03:08Z Mjh 2 wikitext text/x-wiki The cluster consists of == head/login nodes == * 2 head nodes: headnode1 and headnode2, each with 192 GB RAM and 2 x 12-core Xeon processors, for user login and development * job/queue server, stri-cluster, with 2 4-core Xeons and 32 GB RAM: old login node, now deprecated == compute nodes == * 64 Xeons (Gold 6130) 2 socket x 16 core (no hyperthreading) with 192 GB RAM and FDR Infiniband form part of the Main cluster (node001-064), in the main queue * 16 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 12 GB RAM and QDR Infiniband form part of the CAIR cluster (node065-80: chassis 5), in the cair_l queue * 12 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form the rest of the CAIR cluster (node081-92: chassis 6), in the cair_l and cair_s queues * 2 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 24 GB RAM and DDR Infiniband form part of the main cluster (node093-94: chassis 6), in the main queue * 16 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form part of the CAR cluster (node097-node112: chassis 7), in the car queue * 16 Xeons (X5675s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form the rest of the CAR cluster (node113-128: chassis 8), in the car queue * 12 Xeons (E5-2660s) 2 socket x 8-core with 64 GB RAM and FDR Infiniband form the rest of the main cluster (node129-140: chassis 9), in the main queue * 4 Xeons (E5-2660s) 2 socket x 8-core with 16 GB RAM and FDR Infiniband are dedicated AQF (CAIR) nodes (node141-144: chassis 9), in the forecast queue * Four [[SMP machines]], two with 48 cores (4 sockets x 12 cores, Opteron 6174) and two with 32 (one 4 sockets x 8 cores, Xeon 4620), all with 256 GB RAM and QDR Infiniband (smp1-4). * Two SMP blades in chassis6 (node095-096) with 32 cores (2 sockets x 16 cores) and 256 GB RAM, in the smp queue * A [[GPUs|GPU] machine, gpu1, in the gpu queue == GPUs == * See the separate page on [[GPUs]] == Servers and dedicated login nodes == * A file server, stri-server, which is a 2 socket x 8 core Xeon machine ** 473 TB of [[storage]] attached via Fibre Channel to this server. * A cair data processing server - [[cair-cluster]], providing logins and storage to CAIR users ** 132 TB of [[storage]] attached via Fibre Channel to this server. * A separate server -- [[cair-forecast]] -- providing an additional 80 TB of storage and processing for CAIR AQF use. * A separate server -- [[lofar-server]] -- providing logins and 100 TB of storage for [[LOFAR-UK Compute Facility|LOFAR-UK]] users * ramius, a dedicated 16-core machine * mancuso, a dedicated 20-core machine * dstorage1-9, file servers providing the BeegFS distributed [[storage]] == Networking == * Ethernet and infiniband switches provide connectivity. The image below shows the physical layout of the cluster components as at late 2011. The cluster is no longer in this configuration. 
http://stri-cluster.herts.ac.uk/cluster2.jpg aa8fc772ffe760839a841effe8ce5dede712f2f6 518 517 2018-03-10T21:03:25Z Mjh 2 /* compute nodes */ wikitext text/x-wiki The cluster consists of == head/login nodes == * 2 head nodes: headnode1 and headnode2, each with 192 GB RAM and 2 x 12-core Xeon processors, for user login and development * job/queue server, stri-cluster, with 2 4-core Xeons and 32 GB RAM: old login node, now deprecated == compute nodes == * 64 Xeons (Gold 6130) 2 socket x 16 core (no hyperthreading) with 192 GB RAM and FDR Infiniband form part of the Main cluster (node001-064), in the main queue * 16 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 12 GB RAM and QDR Infiniband form part of the CAIR cluster (node065-80: chassis 5), in the cair_l queue * 12 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form the rest of the CAIR cluster (node081-92: chassis 6), in the cair_l and cair_s queues * 2 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 24 GB RAM and DDR Infiniband form part of the main cluster (node093-94: chassis 6), in the main queue * 16 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form part of the CAR cluster (node097-node112: chassis 7), in the car queue * 16 Xeons (X5675s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form the rest of the CAR cluster (node113-128: chassis 8), in the car queue * 12 Xeons (E5-2660s) 2 socket x 8-core with 64 GB RAM and FDR Infiniband form the rest of the main cluster (node129-140: chassis 9), in the main queue * 4 Xeons (E5-2660s) 2 socket x 8-core with 16 GB RAM and FDR Infiniband are dedicated AQF (CAIR) nodes (node141-144: chassis 9), in the forecast queue * Four [[SMP machines]], two with 48 cores (4 sockets x 12 cores, Opteron 6174) and two with 32 (one 4 sockets x 8 cores, Xeon 4620), all with 256 GB RAM and QDR Infiniband (smp1-4). * Two SMP blades in chassis6 (node095-096) with 32 cores (2 sockets x 16 cores) and 256 GB RAM, in the smp queue * A [[GPUs|GPU]] machine, gpu1, in the gpu queue == GPUs == * See the separate page on [[GPUs]] == Servers and dedicated login nodes == * A file server, stri-server, which is a 2 socket x 8 core Xeon machine ** 473 TB of [[storage]] attached via Fibre Channel to this server. * A cair data processing server - [[cair-cluster]], providing logins and storage to CAIR users ** 132 TB of [[storage]] attached via Fibre Channel to this server. * A separate server -- [[cair-forecast]] -- providing an additional 80 TB of storage and processing for CAIR AQF use. * A separate server -- [[lofar-server]] -- providing logins and 100 TB of storage for [[LOFAR-UK Compute Facility|LOFAR-UK]] users * ramius, a dedicated 16-core machine * mancuso, a dedicated 20-core machine * dstorage1-9, file servers providing the BeegFS distributed [[storage]] == Networking == * Ethernet and infiniband switches provide connectivity. The image below shows the physical layout of the cluster components as at late 2011. The cluster is no longer in this configuration. http://stri-cluster.herts.ac.uk/cluster2.jpg 5af925dede525272340fc93fa4a5d5a8a66b7a65 Todo 0 56 481 395 2017-11-13T21:55:51Z Mjh 2 wikitext text/x-wiki 2017 To do: * Decide division of jobs between new head nodes * OS on new head nodes * Copy /home and /soft *Head node services (from old to-do list) **install new CA and generate certs; **install slapd and import ldif. 
**setup denyhosts; **setup dnsmasq **setup exim; **setup mysql or equivalent **setup httpd and copy existng web hierarchy; **setup wiki (in mysql??) **setup munin; **setup ganglia; **setup torque (copy from old including spool dirs for jobs); **setup maui (copy from old); **setup iptables; **setup routing; **setup nfs shares; **setup ntp; **copy /etc/fstab entries; **copy /etc/rc.d/rc.local (license managers etc) **copy existng system cron jobs (home backup) **copy /root (re: scripts ) * Full SL upgrade on all compute and server nodes & reboot * Improve infiniband topology * BeeGFS: ** back up ** move management and metadata servers ** servers to top-level Infiniband switch * Upgrade Torque? * Copy /stri-data and /cair-data to new Beegfs volumes (can't be done till after management/metadata moves) 00fbeaf135455c83d388a81346fc88d3db108b07 482 481 2017-11-13T21:59:27Z Mjh 2 wikitext text/x-wiki 2017 To do: * Decide division of jobs between new head nodes * OS on new head nodes * Copy /home and /soft *Head node services (from old to-do list) **install new CA and generate certs; **install slapd and import ldif. **setup denyhosts; **setup dnsmasq **setup exim; **setup mysql or equivalent **setup httpd and copy existng web hierarchy; **setup wiki (in mysql??) **setup munin; **setup ganglia; **setup torque (copy from old including spool dirs for jobs); **setup maui (copy from old); **setup iptables; **setup routing; **setup nfs shares; **setup ntp; **copy /etc/fstab entries; **copy /etc/rc.d/rc.local (license managers etc) **copy existng system cron jobs (home backup) **copy /root (re: scripts ) * Full SL upgrade on all compute and server nodes & reboot * Improve infiniband topology * BeeGFS: ** back up ** move management and metadata servers ** servers to top-level Infiniband switch * Upgrade Torque? * Copy /stri-data and /cair-data to new Beegfs volumes (can't be done till after management/metadata moves) * Physical removal of old head node, old storage volumes 2d6dae8a57121543fa15f9b4f6caabb000a54fa0 488 482 2017-12-11T10:14:34Z Mjh 2 wikitext text/x-wiki 2017 To do: * Decide division of jobs between new head nodes: done * OS on new head nodes: done * Copy /home and /soft *Head node services (from old to-do list) **install new CA and generate certs; **install slapd and import ldif. **setup denyhosts; **setup dnsmasq **setup exim; **setup mysql or equivalent **setup httpd and copy existng web hierarchy; **setup wiki (in mysql??) **setup munin; **setup ganglia; **setup torque (copy from old including spool dirs for jobs); **setup maui (copy from old); **setup iptables; **setup routing; **setup nfs shares; **setup ntp; **copy /etc/fstab entries; **copy /etc/rc.d/rc.local (license managers etc) **copy existng system cron jobs (home backup) **copy /root (re: scripts ) * Full SL upgrade on all compute and server nodes & reboot: all upgraded and most rebooted, need to reboot chassis7/8 * Improve infiniband topology * BeeGFS: ** back up: done ** move management and metadata servers: done ** servers to top-level Infiniband switch * Upgrade Torque? 
* Copy /stri-data and /cair-data to new Beegfs volumes (can't be done till after management/metadata moves) * Physical removal of old head node, old storage volumes 99e1a5561921d9c638c15a36d87fc91385f0e968 489 488 2017-12-11T19:26:55Z Mjh 2 wikitext text/x-wiki 2017 To do: * Decide division of jobs between new head nodes: done * OS on new head nodes: done * Copy /home and /soft: done *Head node services (from old to-do list) **install new CA and generate certs; **install slapd and import ldif. **setup denyhosts; **setup dnsmasq **setup exim; **setup mysql or equivalent **setup httpd and copy existng web hierarchy; **setup wiki (in mysql??) **setup munin; **setup ganglia; **setup torque (copy from old including spool dirs for jobs); **setup maui (copy from old); **setup iptables; **setup routing; **setup nfs shares; **setup ntp; **copy /etc/fstab entries; **copy /etc/rc.d/rc.local (license managers etc) **copy existng system cron jobs (home backup) **copy /root (re: scripts ) * Full SL upgrade on all compute and server nodes & reboot: all upgraded and most rebooted, need to reboot chassis7/8 * Improve infiniband topology * BeeGFS: ** back up: done ** move management and metadata servers: done ** servers to top-level Infiniband switch * Upgrade Torque? * Copy /stri-data and /cair-data to new Beegfs volumes (can't be done till after management/metadata moves) * Physical removal of old head node, old storage volumes d1778c68cccaac1bf816461f4efe792d7e8f47ee Administrators 0 6 483 464 2017-11-17T13:04:50Z Mjh 2 wikitext text/x-wiki == Administrators == These are currently: * Martin Hardcastle, m.j.hardcastle@herts.ac.uk (x3409, room 2E71 (Innovation Centre)). Contact us with queries. 44925cea29e244d6743937eac5e0f73184dbb2f0 Known problems 0 25 484 412 2017-11-30T11:56:36Z Mjh 2 /* Known problems */ wikitext text/x-wiki == Known problems == * Although the Infiniband links carry NFS traffic between the nodes and the head node, they are not using RDMA and so their latency is higher and bandwidth lower than it should be. This seems to be a problem with the Linux kernels on the nodes; NFS over RDMA+infiniband is unstable. * The scheduler sometimes crashes for unknown reasons causing jobs not to run. (Regularly run scripts check and restart the scheduler.) * The scheduler very occasionally will not run a job that could be run immediately in free resources. If <tt>checkjob</tt> shows 'job can run' and <tt>showstart</tt> shows an immediate start, but your job does not run, then please contact the [[administrators]]. * Node specifications of the form <tt>nodes=main:ppn=16</tt> or <tt>nodes=smp:ppn=1</tt> will severely confuse the scheduler, although they are valid. Please do not use queue names in node specifications: always do something like <tt>-q main -l nodes=1:ppn=16</tt> instead. == Node hardware/sw (for admin use only) == * node001 -- thermal issues at high load * node069 -- hardware failure (replace with spare) * node110-111 -- offline for beegfs testing * node112 -- infiniband not recognised 04d742aa76b444bce8a4f9f33902c67138d6da9f Storage 0 8 485 376 2017-12-06T11:00:49Z Mjh 2 wikitext text/x-wiki The '''storage''' available on the cluster comes in three flavours. Please use all cluster storage space responsibly: when you no longer need it, delete the contents or transfer them elsewhere so that others can use the space. Large datasets will be being processed on the cluster. (See also [[policies]].) 
== System-wide NFS storage == This is shared across all machines and served by single servers in each case. It can be slow if many users are trying to access it simultaneously. We are phasing out the use of this kind of storage for large volumes. Currently general-user NFS volumes are: * 1 Tb of user home directories, mounted as /home * Software directory /soft * 61 Tb of scratch available to all users, mounted as /stri-data * 58 Tb of scratch for CAIR users only, mounted as /cair-scratch * 59 Tb of scratch for CAIR users only, mounted as /cair-data * 334 Tb of scratch for CAR users only, mounted as /car-data * 77 Tb of scratch for CAIR users only, mounted as /cair-work In addition there are some special NFS volumes, which are mounted under /data * /data/lofar : for LOFAR-UK users * /data/jim : Jim Geach * /data/ralf : Ralf Napiwotzki * /data/astroml : Machine learning group The /home disc is for files that are important and require long-term storage (source code, qsub scripts, etc). It should not be used for general data or for large quantities of output from jobs, for example. Large files should be stored on the relevant data disk (if you want to work relative to /home, use a symbolic link). A [[quota]] system is in place on /home. == Distributed file system == The cluster uses the [https://www.beegfs.io/content/ Beegfs] file system for distributed storage, mounted at /beegfs . Files stored here are distributed over a number of servers in a way that's transparent to the user, who sees one big file system. Because there are multiple servers, /beegfs scales much better under load than the NFS storage. There is 309 TB of beegfs storage nominally distributed as follows: * 90 TB: general use, under /beegfs/general * 146 TB: CAR, under /beegfs/car * 73 TB: LOFAR-UK, under /beegfs/local Use of the relevant subdirectories means that you think you have permission to use those allocations. In practice there is no difference in where or how the data are stored. == Disks local to machines == The four [[SMP machines]] have their own local scratch discs: see the relevant page for more information. These are mounted system-wide as /smp1 .. /smp4 Nodes also have a small amount of [[local disk space]] accessible to jobs (but not mounted on the head node), and [[ramdisks]] are also available. == Backups == No data area on the cluster is currently backed up. You must take responsibility for your own backups. /home is backed up to another location on the cluster: that means if you delete a crucial file we ''may'' have a useful backup (ask straight away). f4d6beb23064e064c75fac1df0c3e830131aba29 491 485 2018-01-23T13:51:39Z Mjh 2 /* Distributed file system */ wikitext text/x-wiki The '''storage''' available on the cluster comes in three flavours. Please use all cluster storage space responsibly: when you no longer need it, delete the contents or transfer them elsewhere so that others can use the space. Large datasets will be being processed on the cluster. (See also [[policies]].) == System-wide NFS storage == This is shared across all machines and served by single servers in each case. It can be slow if many users are trying to access it simultaneously. We are phasing out the use of this kind of storage for large volumes. 
Currently general-user NFS volumes are: * 1 Tb of user home directories, mounted as /home * Software directory /soft * 61 Tb of scratch available to all users, mounted as /stri-data * 58 Tb of scratch for CAIR users only, mounted as /cair-scratch * 59 Tb of scratch for CAIR users only, mounted as /cair-data * 334 Tb of scratch for CAR users only, mounted as /car-data * 77 Tb of scratch for CAIR users only, mounted as /cair-work In addition there are some special NFS volumes, which are mounted under /data * /data/lofar : for LOFAR-UK users * /data/jim : Jim Geach * /data/ralf : Ralf Napiwotzki * /data/astroml : Machine learning group The /home disc is for files that are important and require long-term storage (source code, qsub scripts, etc). It should not be used for general data or for large quantities of output from jobs, for example. Large files should be stored on the relevant data disk (if you want to work relative to /home, use a symbolic link). A [[quota]] system is in place on /home. == Distributed file system == The cluster uses the [https://www.beegfs.io/content/ Beegfs] file system for distributed storage, mounted at /beegfs . Files stored here are distributed over a number of servers in a way that's transparent to the user, who sees one big file system. Because there are multiple servers, /beegfs scales much better under load than the NFS storage. There is 399 TB of beegfs storage nominally distributed as follows: * 180 TB: general use, under /beegfs/general * 146 TB: CAR, under /beegfs/car * 73 TB: LOFAR-UK, under /beegfs/local Use of the relevant subdirectories means that you think you have permission to use those allocations. In practice there is no difference in where or how the data are stored. == Disks local to machines == The four [[SMP machines]] have their own local scratch discs: see the relevant page for more information. These are mounted system-wide as /smp1 .. /smp4 Nodes also have a small amount of [[local disk space]] accessible to jobs (but not mounted on the head node), and [[ramdisks]] are also available. == Backups == No data area on the cluster is currently backed up. You must take responsibility for your own backups. /home is backed up to another location on the cluster: that means if you delete a crucial file we ''may'' have a useful backup (ask straight away). b3747f69cfaebbb74edd2b0dd3e879678386e803 492 491 2018-01-23T13:51:58Z Mjh 2 /* System-wide NFS storage */ wikitext text/x-wiki The '''storage''' available on the cluster comes in three flavours. Please use all cluster storage space responsibly: when you no longer need it, delete the contents or transfer them elsewhere so that others can use the space. Large datasets will be being processed on the cluster. (See also [[policies]].) == System-wide NFS storage == This is shared across all machines and served by single servers in each case. It can be slow if many users are trying to access it simultaneously. We are phasing out the use of this kind of storage for large volumes. 
Currently general-user NFS volumes are: * 1 Tb of user home directories, mounted as /home * Software directory /soft * 58 Tb of scratch for CAIR users only, mounted as /cair-scratch * 59 Tb of scratch for CAIR users only, mounted as /cair-data * 334 Tb of scratch for CAR users only, mounted as /car-data * 77 Tb of scratch for CAIR users only, mounted as /cair-work In addition there are some special NFS volumes, which are mounted under /data * /data/lofar : for LOFAR-UK users * /data/jim : Jim Geach * /data/ralf : Ralf Napiwotzki * /data/astroml : Machine learning group The /home disc is for files that are important and require long-term storage (source code, qsub scripts, etc). It should not be used for general data or for large quantities of output from jobs, for example. Large files should be stored on the relevant data disk (if you want to work relative to /home, use a symbolic link). A [[quota]] system is in place on /home. == Distributed file system == The cluster uses the [https://www.beegfs.io/content/ Beegfs] file system for distributed storage, mounted at /beegfs . Files stored here are distributed over a number of servers in a way that's transparent to the user, who sees one big file system. Because there are multiple servers, /beegfs scales much better under load than the NFS storage. There is 399 TB of beegfs storage nominally distributed as follows: * 180 TB: general use, under /beegfs/general * 146 TB: CAR, under /beegfs/car * 73 TB: LOFAR-UK, under /beegfs/local Use of the relevant subdirectories means that you think you have permission to use those allocations. In practice there is no difference in where or how the data are stored. == Disks local to machines == The four [[SMP machines]] have their own local scratch discs: see the relevant page for more information. These are mounted system-wide as /smp1 .. /smp4 Nodes also have a small amount of [[local disk space]] accessible to jobs (but not mounted on the head node), and [[ramdisks]] are also available. == Backups == No data area on the cluster is currently backed up. You must take responsibility for your own backups. /home is backed up to another location on the cluster: that means if you delete a crucial file we ''may'' have a useful backup (ask straight away). 3951888e1cb60e65062c479be34dbdf9ea17fae3 498 492 2018-02-03T09:27:10Z Mjh 2 /* Distributed file system */ wikitext text/x-wiki The '''storage''' available on the cluster comes in three flavours. Please use all cluster storage space responsibly: when you no longer need it, delete the contents or transfer them elsewhere so that others can use the space. Large datasets will be being processed on the cluster. (See also [[policies]].) == System-wide NFS storage == This is shared across all machines and served by single servers in each case. It can be slow if many users are trying to access it simultaneously. We are phasing out the use of this kind of storage for large volumes. 
Currently general-user NFS volumes are: * 1 Tb of user home directories, mounted as /home * Software directory /soft * 58 Tb of scratch for CAIR users only, mounted as /cair-scratch * 59 Tb of scratch for CAIR users only, mounted as /cair-data * 334 Tb of scratch for CAR users only, mounted as /car-data * 77 Tb of scratch for CAIR users only, mounted as /cair-work In addition there are some special NFS volumes, which are mounted under /data * /data/lofar : for LOFAR-UK users * /data/jim : Jim Geach * /data/ralf : Ralf Napiwotzki * /data/astroml : Machine learning group The /home disc is for files that are important and require long-term storage (source code, qsub scripts, etc). It should not be used for general data or for large quantities of output from jobs, for example. Large files should be stored on the relevant data disk (if you want to work relative to /home, use a symbolic link). A [[quota]] system is in place on /home. == Distributed file system == The cluster uses the [https://www.beegfs.io/content/ Beegfs] file system for distributed storage, mounted at /beegfs . Files stored here are distributed over a number of servers in a way that's transparent to the user, who sees one big file system. Because there are multiple servers, /beegfs scales much better under load than the NFS storage. There is 470 TB of beegfs storage nominally distributed as follows: * 180 TB: general use, under /beegfs/general * 146 TB: CAR, under /beegfs/car * 144 TB: LOFAR-UK, under /beegfs/lofar Use of the relevant subdirectories means that you think you have permission to use those allocations. In practice there is no difference in where or how the data are stored. == Disks local to machines == The four [[SMP machines]] have their own local scratch discs: see the relevant page for more information. These are mounted system-wide as /smp1 .. /smp4 Nodes also have a small amount of [[local disk space]] accessible to jobs (but not mounted on the head node), and [[ramdisks]] are also available. == Backups == No data area on the cluster is currently backed up. You must take responsibility for your own backups. /home is backed up to another location on the cluster: that means if you delete a crucial file we ''may'' have a useful backup (ask straight away). 587253491a7fa71c3ac1618e7ea69e4ae3df65d7 515 498 2018-03-10T20:49:22Z Mjh 2 wikitext text/x-wiki The '''storage''' available on the cluster comes in three flavours. Please use all cluster storage space responsibly: when you no longer need it, delete the contents or transfer them elsewhere so that others can use the space. Large datasets will be being processed on the cluster. (See also [[policies]].) == System-wide NFS storage == This is shared across all machines and served by single servers in each case. It can be slow if many users are trying to access it simultaneously. We are phasing out the use of this kind of storage for large volumes. 
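You can check how much space is currently free on any mounted volume with the standard <tt>df</tt> command; a minimal example (here /home simply stands in for whichever volume you use) is:
<pre>
df -h /home     # report size, space used and space available in human-readable units
</pre>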
Currently general-user NFS volumes are: * 1 TB of user home directories, mounted as /home * Software directory /soft * 58 TB of scratch for CAIR users only, mounted as /cair-scratch * 334 TB of scratch for CAR users only, mounted as /car-data * 77 TB of scratch for CAIR users only, mounted as /cair-work In addition there are some special NFS volumes, which are mounted under /data and are only available to particular users or groups: * /data/lofar : for LOFAR-UK users * /data/jim : Jim Geach * /data/ralf : Ralf Napiwotzki * /data/astroml : Machine learning group The /home disc is for files that are important and require long-term storage (source code, qsub scripts, etc). It should not be used for general data or for large quantities of output from jobs, for example. Large files should be stored on the relevant data disk (if you want to work relative to /home, use a symbolic link). A [[quota]] system is in place on /home. == Distributed file system == The cluster uses the [https://www.beegfs.io/content/ BeeGFS] file system for distributed storage, mounted at /beegfs. Files stored here are distributed over a number of servers in a way that's transparent to the user, who sees one big file system. Because there are multiple servers, /beegfs scales much better under load than the NFS storage. There is 560 TB of BeeGFS storage nominally distributed as follows: * 180 TB: general use, under /beegfs/general * 146 TB: CAR, under /beegfs/car * 144 TB: LOFAR-UK, under /beegfs/lofar * 90 TB: CAIR, under /beegfs/cair Using the relevant subdirectory simply indicates which allocation you believe you are entitled to use; in practice there is no difference in where or how the data are stored. == Disks local to machines == The four [[SMP machines]] have their own local scratch discs: see the relevant page for more information. These are mounted system-wide as /smp1 .. /smp4. Nodes also have a small amount of [[local disk space]] accessible to jobs (but not mounted on the head node), and [[ramdisks]] are also available. == Backups == With the exception of /home, no data area on the cluster is currently backed up: you must take responsibility for your own backups. /home is backed up to another location on the cluster, which means that if you delete a crucial file we ''may'' have a useful copy (ask straight away). bb898b4f7715ecaa780034c5200819e69ce6d08a Ramdisks 0 54 486 378 2017-12-06T11:04:56Z Mjh 2 wikitext text/x-wiki All nodes have a 16-Gb ramdisk set up by default. The normal expectation is that you will use the central [[storage]] for file operations, including data processing. However, there may be circumstances where it will be useful to do local, fast file operations. If you want to do this, to avoid interfering with other jobs: * You ''must'' reserve the maximum amount of space that your job will use using the <tt>pmem</tt> option to <tt>qsub</tt>; e.g. <pre>qsub -l nodes=1,pmem=10gb</pre> * You must create a directory in /dev/shm in which your job will work as part of your <tt>qsub</tt> script, which will be unique to your job. For example, you might want to do <pre> mkdir /dev/shm/$PBS_JOBID cd /dev/shm/$PBS_JOBID </pre> * You must only work in this directory, and the total filespace you use must not exceed the reserved amount. * When your job is finished it must, before exiting, clear up the filespace used; no files must be left in <tt>/dev/shm</tt>. Note that /dev/shm is by nature volatile. When a machine is rebooted, the contents of /dev/shm will be irretrievably lost.
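A minimal sketch of the clean-up step at the end of such a script, assuming the job directory was created as in the example above, would be:
<pre>
cd $PBS_O_WORKDIR            # move out of the ramdisk directory before deleting it
rm -rf /dev/shm/$PBS_JOBID   # remove everything the job wrote to the ramdisk
</pre>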
If you want larger, non-volatile local storage, see [[local disk space]]. 31ca04630864c03b613f085808a60cdcc5bb6657 487 486 2017-12-06T11:05:31Z Mjh 2 wikitext text/x-wiki All nodes have RAM-backed storage by default (provided by the OS). The normal expectation is that you will use the central [[storage]] for file operations, including data processing. However, there may be circumstances where it will be useful to do local, fast file operations. If you want to do this, to avoid interfering with other jobs: * You ''must'' reserve the maximum amount of space that your job will use using the <tt>pmem</tt> option to <tt>qsub</tt>; e.g. <pre>qsub -l nodes=1,pmem=10gb</pre> * You must create a directory in /dev/shm in which your job will work as part of your <tt>qsub</tt> script, which will be unique to your job. For example, you might want to do <pre> mkdir /dev/shm/$PBS_JOBID cd /dev/shm/$PBS_JOBID </pre> * You must only work in this directory, and the total filespace you use must not exceed the reserved amount. * When your job is finished it must before exiting clear up the filespace used; no files must be left in <tt>/dev/shm</tt>. Note that /dev/shm is by nature volatile. When a machine is rebooted, the contents of /dev/shm will be irretrievably lost. If you want larger, non-volatile local storage, see [[local disk space]]. 78e0c3302b6baf96755763772e26f1af02e28d35 Main Page 0 1 490 445 2018-01-23T13:48:35Z Mjh 2 /* Welcome to the cluster documentation wiki */ wikitext text/x-wiki == Welcome to the cluster documentation wiki == This wiki is the location for documentation for the UH HPC service. If you are a cluster user, feel free to register for an account so that you can describe any method you use to achieve a particular result on the cluster. You must be approved for an account before editing the Wiki. Some local documentation on [[MediaWiki]] is available. == Known problems == * [[Known problems]] == Cluster basics == * [[Accounts]] and [[Account cancellation policy]] * [[Policies]] * [[Fair share]] * [[Access]] * [[Architecture]] * [[Networking]] * [[Storage]]; [[quota]] system * [[Administrators]]' contact details == Using the cluster == * [[Jobs]] * [[Queues]] * [[Reservations]] * [[SMP machines]] * [[Tesla]] * [[Modules]] * [[MPI]] * [[OpenMP]] * [[Parallelization|How to parallelize your job]] * [[Compilers]] * [[Software]] * [[Mail]] * [[Web server]] * [[Monitoring]] * [[LOFAR-UK Compute Facility]] == Troubleshooting == * [[Why doesn't my job run?]] * [[Job errors]] == Publications == * [[Acknowledgements]] * [[Cluster bibliography]] == How-Tos == * [[Star-CCM+|How to use Star-CCM+ on the cluster]] 38b2f76d28ee115cabffc1096c9474c39ef9d7b6 504 490 2018-03-10T18:39:40Z Mjh 2 wikitext text/x-wiki == Welcome to the cluster documentation wiki == This wiki is the location for documentation for the UH HPC service. If you are a cluster user, feel free to register for an account so that you can describe any method you use to achieve a particular result on the cluster. You must be approved for an account before editing the Wiki. Some local documentation on [[MediaWiki]] is available. 
== Getting started == * [[Read this first]] == Cluster basics == * [[Accounts]] and [[Account cancellation policy]] * [[Policies]] * [[Fair share]] * [[Access]] * [[Architecture]] * [[Networking]] * [[Storage]]; [[quota]] system * [[Administrators]]' contact details == Using the cluster == * [[Jobs]] * [[Queues]] * [[Reservations]] * [[SMP machines]] * [[Tesla]] * [[Modules]] * [[MPI]] * [[OpenMP]] * [[Parallelization|How to parallelize your job]] * [[Compilers]] * [[Software]] * [[Mail]] * [[Web server]] * [[Monitoring]] * [[LOFAR-UK Compute Facility]] == Troubleshooting == * [[Why doesn't my job run?]] * [[Job errors]] == Publications == * [[Acknowledgements]] * [[Cluster bibliography]] == How-Tos == * [[Star-CCM+|How to use Star-CCM+ on the cluster]] == Known problems == * [[Known problems]] 2955f9395ee95cd823b70383f4e18d06e27e2cbc 510 504 2018-03-10T18:58:24Z Mjh 2 /* Using the cluster */ wikitext text/x-wiki == Welcome to the cluster documentation wiki == This wiki is the location for documentation for the UH HPC service. If you are a cluster user, feel free to register for an account so that you can describe any method you use to achieve a particular result on the cluster. You must be approved for an account before editing the Wiki. Some local documentation on [[MediaWiki]] is available. == Getting started == * [[Read this first]] == Cluster basics == * [[Accounts]] and [[Account cancellation policy]] * [[Policies]] * [[Fair share]] * [[Access]] * [[Architecture]] * [[Networking]] * [[Storage]]; [[quota]] system * [[Administrators]]' contact details == Using the cluster == * [[Jobs]] * [[Queues]] * [[Reservations]] * [[SMP machines]] * [[GPUs]] * [[Modules]] * [[MPI]] * [[OpenMP]] * [[Parallelization|How to parallelize your job]] * [[Compilers]] * [[Software]] * [[Mail]] * [[Web server]] * [[Monitoring]] * [[LOFAR-UK Compute Facility]] == Troubleshooting == * [[Why doesn't my job run?]] * [[Job errors]] == Publications == * [[Acknowledgements]] * [[Cluster bibliography]] == How-Tos == * [[Star-CCM+|How to use Star-CCM+ on the cluster]] == Known problems == * [[Known problems]] 884444520a499741e5f95866af1b9ef464a3699a Acknowledgements 0 29 494 454 2018-02-03T07:48:35Z Mjh 2 wikitext text/x-wiki If possible, please say explicitly that you have used the STRI cluster in any paper you publish that makes use of results obtained using it. We don't insist on any explicit form of words, but an example might be 'This work has made use of the University of Hertfordshire's high-performance computing facility.' If you wish you can add a link to <tt>http://uhhpc.herts.ac.uk/</tt>. Please also add details of any submitted, accepted or published paper using the cluster to the [[Cluster bibliography]] page. ebbcc8dfa8c37e921d908d353b4d6522575db3da 495 494 2018-02-03T07:48:50Z Mjh 2 wikitext text/x-wiki If possible, please say explicitly that you have used UH HPC in any paper you publish that makes use of results obtained using it. We don't insist on any explicit form of words, but an example might be 'This work has made use of the University of Hertfordshire's high-performance computing facility.' If you wish you can add a link to <tt>http://uhhpc.herts.ac.uk/</tt>. Please also add details of any submitted, accepted or published paper using the cluster to the [[Cluster bibliography]] page. 
7fe78d3f8bb0d6998f0b0191b21712ea8736ea9d Access 0 5 496 338 2018-02-03T09:23:07Z Mjh 2 wikitext text/x-wiki == Access == The [[architecture|head node]]s of the cluster are accessible by ssh to uhhpc.herts.ac.uk, once you have an [[accounts|account]] set up. If you are working from a Unix desktop, you should be able to type <tt>ssh username@uhhpc.herts.ac.uk</tt>. If you are using Windows, you will need a Windows ssh client: we recommend PuTTY[http://www.chiark.greenend.org.uk/~sgtatham/putty/]. Unless specific authorization from the [[administrators]] is provided to the contrary, individual compute nodes must be accessed either through batch [[jobs]] or via [[interactive jobs]] run on the head node: see also the [[policies|policy]] relating to this. 9aeb56e510e15eb3c9a4f8acdf2fae17856146be 506 496 2018-03-10T18:50:43Z Mjh 2 wikitext text/x-wiki == Access == The [[architecture|head node]]s of the cluster are accessible by ssh to uhhpc.herts.ac.uk, once you have an [[accounts|account]] set up. If you are working from a Unix desktop, you should be able to type <tt>ssh username@uhhpc.herts.ac.uk</tt>. If you are using Windows, you will need a Windows ssh client: we recommend PuTTY[http://www.chiark.greenend.org.uk/~sgtatham/putty/]. Unless specific authorization from the [[administrators]] is provided to the contrary, individual compute nodes must be accessed either through batch [[jobs]] or via [[interactive jobs]] run on the head nodes: see also the [[policies|policy]] relating to this. You may not log in to compute nodes directly, or run code on the head nodes. ac03f76c9dc50fd9245838592a3df3b3f506fd82 Interactive jobs 0 35 499 269 2018-02-03T09:27:55Z Mjh 2 wikitext text/x-wiki Interactive jobs are jobs which give you an interactive session on one of the compute nodes. Importantly, accessing the compute nodes this way means that the job control system guarantees the resources that you have asked for. If you simply log into a compute node with ssh (which is in any case forbidden by the cluster access [[policies]]), another user's job may compete with what you are trying to do. Therefore you should, unless explicitly authorized otherwise, always use the interactive job facility to run interactively on the compute nodes. == Running an interactive job == An interactive job is run using the <tt>qsub</tt> command as for normal [[jobs]]. However, the result of running an interactive job is that you are logged on to the target machine. For example, <pre> [user@headnode1 ~]$ qsub -l walltime=00:30:00 -I -q main qsub: waiting for job 123456.stri-cluster.herts.ac.uk to start qsub: job 123456.stri-cluster.herts.ac.uk ready [user@node047 ~]$ </pre> In this example the user has requested a 30-minute session on any node in the main cluster (the default request is one processor on one node). A node is available to run the job and so the user finds herself at a shell prompt. She may now use the node interactively for 30 minutes. At the end of the 30 minutes (wall time, not CPU time!) the job will end and the session will be terminated. If the user logs out before then, the job will terminate early. Note that, just as with ordinary jobs, you must honour the allocation you are given. If you ask for one CPU on one node, you must only use that. In the interactive shell, a number of environment variables beginning with PBS (try <tt>printenv | grep PBS</tt>) are provided to tell you the job environment in which you are running in case you have forgotten. 
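For example, as a quick check from within the interactive session you might run:
<pre>
printenv | grep PBS     # list the PBS_* variables describing your job
cat $PBS_NODEFILE       # show the node(s) and CPU slots allocated to you
</pre>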
If your request for an interactive shell cannot be fulfilled (because there are insufficient nodes available in the queue that you have specified), the qsub command will wait until it can be. == Advanced topics == === Multiple CPUs === If you know you want a machine completely dedicated to you (e.g. because you plan to run multithreaded code interactively) then you must explicitly request that: e.g., <pre> qsub -l walltime=24:00:00 -l nodes=1:ppn=48 -I -q smp </pre> will reserve all 48 cores of one of the [[SMP machines]] for you for a day. === Multiple nodes === In the unlikely situation in which you want to run interactively on several nodes at once, you will only be given one shell. You can use the <tt>pbsdsh</tt> command to run the same commands on each of the processors you have been allocated, or /usr/local/bin/mpiexec to run [[MPI]] jobs. <pre> qsub -l walltime=00:01:00 -l nodes=2:ppn=2 -I -q smp qsub: waiting for job 123456.stri-cluster.herts.ac.uk to start qsub: job 123456.stri-cluster.herts.ac.uk ready [user@smp2 ~]$ pbsdsh hostname smp2 smp1 smp1 smp2 </pre> === Specific machines === It is possible to request a specific machine just as for normal non-interactive [[jobs]]: <pre> qsub -l walltime=00:01:00 -l nodes=smp2:ppn=48 -I -q smp </pre> Do this if you know that you need to use some particular feature of the machine, such as the [[SMP machines]]' scratch discs. === X forwarding === If you want to run jobs that use the X windows system, you may turn on X11 forwarding using the <tt>-X</tt> option. (This is like the <tt>-X</tt> option to <tt>ssh</tt>.) === Walltime requests === Please be considerate in your walltime requests. While you have an interactive job running, no other user can use the resources you have allocated. Therefore, you should not set an interactive job running for a week and walk away. Jobs that appear to be reserving resources without using them will be terminated. Obviously the best strategy is to select a walltime that roughly matches what you want and exit before the time is up. 1cda9f631819784d0552ea3b2ae40db3a74a28d5 LOFAR-UK Compute Facility 0 57 500 460 2018-02-03T09:28:36Z Mjh 2 wikitext text/x-wiki The LOFAR-UK compute facility is a part of the UH HPC facility reserved for LOFAR-UK use. It consists of a dedicated login machine and file server with 256 GB RAM and 16 cores, plus some number (up to 12) of compute nodes mostly with 64 GB RAM and 16 cores. (Some machines have 192 or 256 GB RAM.) Reservations on the more powerful [[SMP machines]] can be made if necessary. LOFAR-UK users can get an account by contacting Martin Hardcastle. This will give them the ability to log in to <tt>lofar.herts.ac.uk</tt>. Data can be downloaded to the dedicated area <tt>/data/lofar/</tt>. Data processing should generally be carried out on the compute nodes. When you have a particular project to do, ask for a reservation of enough of the LOFAR nodes to allow you to do what you need to do. When you're finished, let us know so that the reservation can be released. (Local, i.e. Herts-based, users need to submit jobs with the option <tt>-W group_list=lofar</tt> to make use of the reservation. External LOFAR-UK users do not.) All the standard cluster commands work on lofar.herts.ac.uk. You can do your analysis by running interactive or non-interactive [[jobs]], as appropriate. See the [[LOFAR]] page for information on LOFAR software. A description of the [[Herts LOFAR HBA pipeline]] is available. A description of the [[generic pipeline]] is available. 
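As a minimal sketch of the submission syntax for Herts-based users (the script name <tt>myjob.sh</tt> and the resource request are purely illustrative), a job run against the LOFAR-UK reservation would look something like:
<pre>
qsub -W group_list=lofar -l nodes=1:ppn=16,walltime=24:00:00 myjob.sh
</pre>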
4de38ece545c858c9d4891297022cf0b6886574b Cair-cluster 0 40 501 275 2018-02-03T09:30:06Z Mjh 2 wikitext text/x-wiki == Cair data processing server == There is now a dedicated file server for cair users. The hostname is <code>cair-cluster</code> which is accessible from the private data network and the UH student network (using the FQDN <code>cair-cluster.herts.ac.uk</code>). The server is a Dell R520 with two Intel Xeon E5-2450L 1.80GHz processors and 32 GB RAM. It is connected to the "cair" InfiniBand network (192.168.4.0) via a dual-port QDR HBA. The server has ~ 77 TB of directly attached (via fibre channel) storage which has been configured to a RAID6 specification and is mounted as /cair-storage (on all cair nodes and the head node) This server can be used for post processing on large datasets. We have also enabled job submission on this server, so if preferred, cair users do not have to log on to <code>uhhpc</code> at all. a110a8349250c00bb66b135cd9df2fe5b52a1aec Policies 0 4 502 292 2018-03-10T16:26:33Z Mjh 2 wikitext text/x-wiki The cluster is by design a shared resource. In using it you must be considerate of other users. Some detailed guidelines are as follows: * Accounts are for use by the named user only. You must not allow anyone else to use your account. * The [[architecture|head node]]s must '''never''' be used for computation of any kind on a larger scale than a few minutes' testing or processing of results from the compute nodes. * The only permitted method of using the compute nodes is by way of the [[jobs|batch queuing system]]. You '''must not''' log in to the compute nodes directly using ssh. If you need an interactive session on a compute node, e.g. for data processing, use the [[interactive jobs]] facility. * When using the batch queuing system you '''must''' honour the allocations of nodes that your job is given. * Please use the [[storage]] with consideration to others. We reserve the right in extreme situations to tidy up after you. * There is a [[fair share]] policy which has been tuned to a certain extent and is operated automatically by the scheduler. Essentially this means that you may find that other people's jobs overtake yours in the queue in an attempt to distribute resources fairly. The intention is that this system will operate only on a timescale of days, though. We do not presently guarantee fairness on timescale of minutes to hours, because to do that we would have to be able to pre-empt jobs that are already running. Please be aware of these limitations of the fair-share policy and work within them. * If you are one of the group of people with exclusive or privileged use of a group of nodes, you must use those nodes in preference to the nodes of the main cluster if possible. You should not run jobs in the 'main' queue unless no nodes in your reserved group are available. This applies to members of CAIR and CAR. c317ecce84939ae09185440d3a5d66a85be0aa33 505 502 2018-03-10T18:49:29Z Mjh 2 wikitext text/x-wiki The cluster is by design a shared resource. In using it you must be considerate of other users. Some detailed guidelines are as follows: * Accounts are for use by the named user only. You must not allow anyone else to use your account. * The [[architecture|head node]]s must '''never''' be used for computation of any kind on a larger scale than a few minutes' testing or processing of results from the compute nodes. * The only permitted method of using the compute nodes is by way of the [[jobs|batch queuing system]]. 
You '''must not''' log in to the compute nodes directly using ssh. If you need an interactive session on a compute node, e.g. for data processing, use the [[interactive jobs]] facility. * When using the batch queuing system you '''must''' honour the allocations of nodes that your job is given. * Please use the [[storage]] with consideration to others. We reserve the right in extreme situations to tidy up after you. * There is a [[fair share]] policy which has been tuned to a certain extent and is operated automatically by the scheduler. Essentially this means that you may find that other people's jobs overtake yours in the queue in an attempt to distribute resources fairly. The intention is that this system will operate only on a timescale of days, though. We do not presently guarantee fairness on timescale of minutes to hours, because to do that we would have to be able to pre-empt jobs that are already running. Please be aware of these limitations of the fair-share policy and work within them. * If you are one of the group of people with exclusive or privileged use of a group of nodes, you must use those nodes in preference to the nodes of the main cluster if those nodes can meet your requirements. You should not run jobs in the 'main' queue unless no nodes in your reserved group are available. This applies to members of CAIR and CAR. 7fbbbf4a6eca718b073f575b6cc3b6234ae4cb78 Jobs 0 9 507 447 2018-03-10T18:54:11Z Mjh 2 wikitext text/x-wiki The main mechanism for carrying out computing on the cluster is the '''batch queue system'''. This is based around [http://docs.adaptivecomputing.com/torque/help.htm Torque] (formerly known as PBS) and [http://docs.adaptivecomputing.com/maui/index.php Maui], which are free software and widely used in the HPC world, particularly in academia. Torque is the queueing system, which handles job submission, and Maui is the scheduler, which decides what jobs are run. Rather than running processes directly on the compute nodes, you submit a script which, in general, contains both directives to the queuing system and the commands needed to run your jobs. The scheduler will then decide, based on the available resources, whether your job can be run immediately or must be deferred till later. If other people's jobs are running, you are likely to have to wait. It is important therefore to request only the minimum resources you need: a job that only needs 8 nodes may be able to run immediately while a job that requests all nodes in the main cluster may need to wait for days. Once your job starts running the standard output and standard error can be saved or discarded, at your option. Most jobs will operate on files in one of the cluster-wide file systems (see [[storage]]). You can also arrange to have the system send you e-mail when your job starts and/or ends. You should enable [[passwordless ssh]] between nodes before starting to use the job submission system. Various parts of it and code that depends on it rely on being able to use ssh or scp. == Basic commands == The command to submit a job to the batch queue system is <tt>qsub</tt>. <tt>qsub</tt> takes one obligatory argument, the name of the script that will be run to start the job: <pre> qsub myjob.sh </pre> This must be a script: specifying a binary file will not work. Optionally, qsub can take a number of arguments that specify what resources the job requires and how it is to be handled by the queuing system. 
Alternatively, these can be specified in the script, tagged with lines beginning <tt>#PBS</tt> before any executable lines. For example, <pre> cat << END > myjob1.sh #!/bin/sh echo "Hello world" END qsub -N hello -m abe myjob1.sh </pre> and <pre> cat <<END > myjob2.sh #!/bin/sh #PBS -N hello #PBS -m abe echo "Hello world" END qsub myjob2.sh </pre> are functionally equivalent. It is often more useful to have the options specified in the script. You can also have a mixture: if the same option is specified on the <tt>qsub</tt> command line and in a script directive, the command line takes precedence. A full list of qsub options is given by <tt>man qsub</tt> or is available [http://docs.adaptivecomputing.com/torque/2-5-12/help.htm#topics/commands/qsub.htm online]. Some important ones are as follows: * -N: specify the name of the job as it will appear in the queue * -m: say when the system should e-mail you: <tt>-m abe</tt> means to e-mail on abort, begin and end. Before setting this flag, read the page on [[Mail]] and take appropriate action! * -k: specify whether the standard output or standard error streams of the job should be mirrored in your home directory. If you specify this, you will be able to see the output as it's generated, but it will appear in your home directory. If you don't, the output will be stored in the directory specified by -o and -e or in the current working directory of qsub if they are not specified, but it will only appear once the job is finished. * -j: specify whether the output and error streams should be kept separate or merged. * -o, -e: specify where standard output/error should be stored, if not in the current directory * -q: specify what [[queues|queue]] the job will be run on. * -t: start multiple jobs simultaneously (see below) * -W: for job inter-dependencies (see below) * -v: to set environment variables (see below) The most important option is <tt>-l</tt>, which specifies the resources to be used by the job. On our system the useful resources to specify are * <tt>walltime</tt>: the expected execution time in h:m:s format (if unspecified, the default walltime on all queues is 24 hours). * <tt>nodes</tt>: a list of requirements on the nodes to be allocated, or possibly more than one joined by a plus sign. * <tt>pmem</tt>: the physical [[memory]] requirements for your job. * <tt>file</tt>: the [[local disk space]] to be used by your job. For example, <pre> qsub -l walltime=2:0:0 -l nodes=40 job.sh ## run for two hours on 40 CPUs (which may be spread over up to 40 physical nodes) qsub -l walltime=1:0:0 -l nodes=16:ppn=8 job.sh ## run for one hour on 16 nodes with 8 CPUs per node qsub -l walltime=0:1:0 -l nodes=2:chassis1:ppn=4 job2.sh ## run for one minute on 2 nodes from chassis1 with 4 CPUs per node qsub -l walltime=0:5:0 -l nodes=1:chassis1:ppn=1+1:chassis2:ppn=1 job3.sh ## run for 5 minutes on one node from chassis1 and one from chassis2 qsub -l walltime=0:0:10 -l nodes=node001:ppn=8 job4.sh ## run for 10 seconds on all CPUs of node001 qsub -l walltime=0:1:0 -l nodes=1:2 -l pmem=8gb job5.sh ## Run two processes for one minute, requesting 8 Gb of memory each </pre> (In general you will not want to specify node or [[architecture|chassis]] names directly, as this will interfere with the efficiency of the scheduling. The exception might be if you want to guarantee that all inter-process communications take place within a chassis; see [[networking]]). 
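The <tt>file</tt> resource listed above is requested in the same way; a minimal sketch (the script name and the 20gb figure are purely illustrative) is:
<pre>
qsub -l walltime=1:0:0 -l nodes=1,file=20gb job6.sh  ## run for one hour on one CPU, reserving 20 GB of local disk space
</pre>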
Note also that in the present configuration requests for fewer processors per node than are available may be consolidated onto fewer nodes. So the third request above, run on an empty cluster, muight actually run on one physical node with 8 CPUs. This should never be a problem unless you are specifically testing inter-node communication. For efficiency, you are recommended always to ask for and use the maximum number of processes per node for MPI or multi-threaded code -- that is, 32 processes for the main cluster. If you are running single-threaded code, you should ask for one processor per node. The defaults for all the other options are sensible but you ''must'' think carefully about the resources you need. Requesting more time or resources than you need will lead to inefficiency in scheduling your jobs: requesting less than you need (particularly in terms of walltime) is likely to mean that your job does not run to completion. A job that exceeds the walltime estimate will be terminated. ''A walltime should be specified: the default walltime (see [[Queues]]) is probably not what you want.'' Once a job is submitted, you can view its progress with <tt>qstat</tt>: <pre> >qstat Job id Name User Time Use S Queue ------------------- ---------------- --------------- -------- - ----- 1765.stri-cluster HCG62-doublebeta mjh 00:00:00 R all 1766.stri-cluster ...59-doublebeta mjh 00:00:00 R all 1767.stri-cluster ...07-doublebeta mjh 00:00:00 R all 1768.stri-cluster ...83-doublebeta mjh 00:00:00 R all 1769.stri-cluster ...73-doublebeta mjh 00:00:00 R all 1770.stri-cluster ...61-doublebeta mjh 0 Q all 1771.stri-cluster ...36-doublebeta mjh 0 Q all 1772.stri-cluster ...44-doublebeta mjh 0 Q all 1773.stri-cluster ...29-doublebeta mjh 0 Q all 1774.stri-cluster ...33-doublebeta mjh 0 Q all 1775.stri-cluster ...46-doublebeta mjh 0 Q all >qstat -a stri-cluster.herts.ac.uk: Req'd Req'd Elap Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time -------------------- -------- -------- ---------- ------ ----- --- ------ ----- - ----- 1765.stri-cluster.he mjh all HCG62-doub 11125 16 -- -- 05:00 R 01:56 1766.stri-cluster.he mjh all IC1459-dou 10658 16 -- -- 05:00 R 01:56 1767.stri-cluster.he mjh all NGC3607-do 7355 16 -- -- 05:00 R 01:56 1768.stri-cluster.he mjh all NGC383-dou 12069 16 -- -- 05:00 R 01:56 1769.stri-cluster.he mjh all NGC4073-do 12793 16 -- -- 05:00 R 01:56 1770.stri-cluster.he mjh all NGC4261-do -- 16 -- -- 05:00 Q -- 1771.stri-cluster.he mjh all NGC4636-do -- 16 -- -- 05:00 Q -- 1772.stri-cluster.he mjh all NGC5044-do -- 16 -- -- 05:00 Q -- 1773.stri-cluster.he mjh all NGC5129-do -- 16 -- -- 05:00 Q -- 1774.stri-cluster.he mjh all NGC533-dou -- 16 -- -- 05:00 Q -- 1775.stri-cluster.he mjh all NGC5846-do -- 16 -- -- 05:00 Q -- </pre> The different options to qstat give different views of the queue. Here we see that some jobs are running (R) and some are queued (Q). You may also see jobs that have recently completed (C), jobs that are held pending some condition (H) and, hopefully rarely, jobs where there is some error in the scheduling system (E). 
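If many users have jobs in the queue, it is often convenient to list only your own; the standard <tt>-u</tt> option to <tt>qstat</tt> does this:
<pre>
qstat -u $USER     # show only jobs belonging to the named user
</pre>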
The MAUI tool <tt>showq</tt> can also be used to get a view of the whole queue, which is sometimes more user-friendly: <pre> /usr/local/maui/bin/showq ACTIVE JOBS-------------------- JOBNAME USERNAME STATE PROC REMAINING STARTTIME 1765 mjh Running 128 1:15:20 Fri May 7 13:38:20 1766 mjh Running 128 1:15:20 Fri May 7 13:38:20 1767 mjh Running 128 1:15:20 Fri May 7 13:38:20 1768 mjh Running 128 1:15:20 Fri May 7 13:38:20 1769 mjh Running 128 1:15:20 Fri May 7 13:38:20 5 Active Jobs 640 of 640 Processors Active (100.00%) 80 of 80 Nodes Active (100.00%) IDLE JOBS---------------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME 1770 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1771 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1772 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1773 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1774 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1775 mjh Idle 128 5:00:00 Fri May 7 13:38:20 6 Idle Jobs BLOCKED JOBS---------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME Total Jobs: 11 Active Jobs: 5 Idle Jobs: 6 Blocked Jobs: 0 </pre> Finally, you can delete jobs from the queue with the Torque command <tt>qdel</tt>, you can change your resource request with <tt>qalter</tt>, and you can move queued jobs between different queues with <tt>qmove</tt>. == Running code == Your job will start running on ''one'' of the nodes it has been allocated. It is your script's responsibility to start your code running on any other nodes you have been allocated. At run time, environment variables contain information about the job. A full list is [http://docs.adaptivecomputing.com/torque/2-5-12/help.htm#topics/2-jobs/exportedBatchEnvVar.htm here]. The crucial one is $PBS_NODEFILE, which provides a list of nodes on which the job should run. ''You must honour this node list.'' The [http://docs.adaptivecomputing.com/torque/2-5-12/help.htm#topics/commands/pbsdsh.htm pbsdsh] command can be used in your script to run a single program on all the allocated nodes, as in the following example: <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ pbsdsh hostname echo ------------------------------------------------------ echo Job ends </pre> pbsdsh would be suitable for starting a number of parallel processes which do not need to communicate, e.g. multiple identical Monte Carlo simulations. For scripts to start [[MPI]] processes in such a way that inter-process communication (IPC) works see the description of [[MPI]]. If you are running [[OpenMP]] code, which should only be running on a single node, then you need to make sure that the code does not grab all available CPUs. An easy way to make sure that you take only the CPUs that you have been allocated would be to put <pre> setenv OMP_NUM_THREADS `cat $PBS_NODEFILE | wc -l` </pre> in the qsub script before the code runs. 
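The line above uses csh syntax; if your submission script uses <tt>/bin/sh</tt> or bash, as the example scripts on this page do, the equivalent is:
<pre>
export OMP_NUM_THREADS=`cat $PBS_NODEFILE | wc -l`   # one OpenMP thread per allocated CPU slot
</pre>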
== Environment == In all cases your script must set the environment that you want your program to run in. Environment variables, aliases and working directory are not inherited from the shell that runs qsub. So you will often want to change to an appropriate working directory and to source some startup files in your qsub script before you actually execute any code. <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ cd /home/fred/my_working_directory export PATH=/home/fred/my_binaries:${PATH} /usr/local/bin/mpiexec my-mpi-code arg1 arg2 echo ------------------------------------------------------ echo Job ends </pre> Note that you can set environment variables in the environment of the running script with the <tt>-v</tt> option to <tt>qsub</tt>: <pre> qsub -v NAME=fred myjob.sh </pre> The value of <tt>$NAME</tt> is then available to the script. This is useful if you want to write a generic script and control it using a parameter, rather than editing the script. == Multiple job submission == The <tt>-t</tt> option to qsub allows you to create multiple jobs simultaneously. These will all run the same script but will differ in the value of the environment variable <tt>$PBS_ARRAYID</tt>. Your script should normally use this variable to decide how to behave differently. For example, multiple runs of a Monte Carlo simulation might use it to store the output in different files. The jobs will all be submitted to the queue immediately and may run concurrently, depending on the resources they request, so it is important to make sure that it is safe to do this. A common mistake is to use the -t option wrongly. <pre> qsub -t 4 myjob.qsub </pre> will only start ''one'' job, with $PBS_ARRAYID set to 4. <pre> qsub -t 1-4 myjob.qsub </pre> will queue four jobs, with $PBS_ARRAYID ranging from 1 to 4. == Interactive jobs == If you need to access nodes interactively, see the separate page on [[interactive jobs]]. == Jobs that depend on other jobs == You may wish to submit jobs that depend on each other: for example, to submit a job that only runs after another job has finished. (This allows you to stack up very long computing requests without violating the maximum walltime constraints.) Look at <tt>man qsub</tt> for the full details, but for a simple dependence use <pre> qsub -W depend=afterany:123456 myjob.qsub </pre> which will cause your job to be run only after job 123456 has finished (no matter what its output status). 01d909f28a66ec93781271e68664378c9dec1332 Queues 0 15 508 380 2018-03-10T18:56:32Z Mjh 2 wikitext text/x-wiki There are six possible job queues available for general use on the system: * 'main' is the default queue: this submits to the 78 nodes of the main cluster. The maximum wall time on this queue is 1 week. * 'cair_s' submits to a CAIR node dedicated to small (few CPU) jobs. 
This queue is restricted to CAIR users. * 'cair_l' submits to the 27 CAIR nodes with no maximum wall time. This queue is restricted to CAIR users. * 'smp' submits to the [[SMP machines]]. It is not currently possible to run MPI jobs that span the SMP machines and the main or CAIR clusters. The maximum wall time for this queue is 48 hours. * 'car' submits to the 32 dedicated CAR nodes. This queue is restricted to CAR users. The maximum wall time for this queue is 1 week. == Default wall times == The default wall time limitation for the 'cair_s' queue is also the maximum, i.e. 6 hours. The default wall time for all the other queues is 24 hours. If you want a job to run for longer, you must explicitly provide a wall time estimate. == Other defaults == The default memory use per process for the main queue is 1 Gb. If you need more memory than this, see the section on [[memory]]. The default number of nodes for a job on all queues is 1. fe8a416e71570583c2fb442974799479ddf51f03 516 508 2018-03-10T20:50:51Z Mjh 2 wikitext text/x-wiki There are six possible job queues available for general use on the system: * 'main' is the default queue: this submits to the 78 nodes of the main cluster. The maximum wall time on this queue is 1 week. * 'gpu' submits to the GPU nodes. The maximum wall time on this queue is 48 hours. * 'smp' submits to the [[SMP machines]]. It is not currently possible to run MPI jobs that span the SMP machines and the main or CAIR clusters. The maximum wall time for this queue is 48 hours. * 'cair_s' submits to a CAIR node dedicated to small (few CPU) jobs. This queue is restricted to CAIR users. * 'cair_l' submits to the 27 CAIR nodes with no maximum wall time. This queue is restricted to CAIR users. * 'car' submits to the 32 dedicated CAR nodes. This queue is restricted to CAR users. The maximum wall time for this queue is 1 week. == Default wall times == The default wall time limitation for the 'cair_s' queue is also the maximum, i.e. 6 hours. The default wall time for all the other queues is 24 hours. If you want a job to run for longer, you must explicitly provide a wall time estimate. == Other defaults == The default memory use per process for the main queue is 1 Gb. If you need more memory than this, see the section on [[memory]]. The default number of nodes for a job on all queues is 1. dcc0dbe5282e0a522b79d8c290f728c4477b0448 Read this first 0 70 509 2018-03-10T18:57:48Z Mjh 2 Created page with "= Introduction to cluster computing = If you are new to the concept of cluster computing, read this '''before doing anything else'''. Cluster computing is fundamentally diff..." wikitext text/x-wiki = Introduction to cluster computing = If you are new to the concept of cluster computing, read this '''before doing anything else'''. Cluster computing is fundamentally different from personal computing. If you do not understand how it works, you may cause problems for other users, or break [[Policies|cluster rules]]. The cluster is composed of 'nodes' which are individual computers, joined together by a network. Nodes have different roles. Specifically there are a few 'login nodes' or 'head nodes' (what you see when you log in to the cluster), and many (about 150) [[Architecture|compute nodes]], which are where the actual computing work is done. You do not run code on the login nodes. Rather, you write a script to run a '[[Jobs|job]]' on the compute nodes. Your script tells the cluster both what you want to do, and what resources you need to do it. 
Your job is then run on one or more compute nodes that match your requirements. If your code is capable of doing so, this model allows you to access many nodes simultaneously, and so to get much greater resources than your desktop or laptop computer can provide. But you need to understand it to use it. New users should read '''at least''' the following Wiki pages: * [[Accounts]] -- to find out how to get an account * [[Access]] -- to find out how to get access to the cluster * [[Architecture]] -- to find out what nodes there are * [[Jobs]] -- to find out how to run jobs on appropriate compute nodes Please don't approach the [[administrators]] for help until you have read and understood these pages. c2aae6699a4e8afbd2e56bf13f58067207540d2a 514 509 2018-03-10T20:47:12Z Mjh 2 wikitext text/x-wiki = Introduction to cluster computing = If you are new to the concept of cluster computing, read this '''before doing anything else'''. Cluster computing is fundamentally different from personal computing. If you do not understand how it works, you may cause problems for other users, or break [[Policies|cluster rules]]. The cluster is composed of 'nodes' which are individual computers, joined together by a network. Nodes have different roles. Specifically there are a few 'login nodes' or 'head nodes' (what you see when you log in to the cluster), and many (about 150) [[Architecture|compute nodes]], which are where the actual computing work is done. You do not run code on the login nodes. Rather, you write a script to run a '[[Jobs|job]]' on the compute nodes. Your script tells the cluster both what you want to do, and what resources you need to do it. Your job is then run on one or more compute nodes that match your requirements. If your code is capable of doing so, this model allows you to access many nodes simultaneously, and so to get much greater resources than your desktop or laptop computer can provide. But you need to understand it to use it. New users should read '''at least''' the following Wiki pages: * [[Accounts]] -- to find out how to get an account * [[Access]] -- to find out how to get access to the cluster * [[Architecture]] -- to find out what nodes there are * [[Jobs]] -- to find out how to run jobs on appropriate compute nodes * [[Queues]] -- to understand which queue to use * [[Storage]] -- to understand how and where to store data on the cluster Please don't approach the [[administrators]] for help until you have read and understood these pages. 01f52222538956374a9276a4bf38b4fbc488c9c5 GPUs 0 71 512 2018-03-10T20:31:58Z Mjh 2 Created page with "Several machines on the cluster have attached NVIDIA GPUs. * gpu1 is the main cluster gpu machine. It is currently the only machine in the <tt>gpu</tt> queue. The attached GP..." wikitext text/x-wiki Several machines on the cluster have attached NVIDIA GPUs. * gpu1 is the main cluster gpu machine. It is currently the only machine in the <tt>gpu</tt> queue. The attached GPUs are 6 Tesla K80 units. * smp1 has a 4 attached Tesla S2050s. These are now very old and unlikely to be much use except for testing purposes. * ramius has a single Tesla K40c. ramius is a private machine. The NVIDIA CUDA software is installed in '/soft/cuda-VERSION' where VERSION is the version number of CUDA. 9325cb64ae7c784c9fa7c6254574f62e1b4912bb 513 512 2018-03-10T20:34:42Z Mjh 2 wikitext text/x-wiki Several machines on the cluster have attached NVIDIA GPUs. * gpu1 is the main cluster gpu machine. It is currently the only machine in the <tt>gpu</tt> queue. 
The attached GPUs are 6 Tesla K80 units. * smp1 has 4 attached Tesla S2050s. These are now very old and unlikely to be much use except for testing purposes. * ramius has a single Tesla K40c. ramius is a private machine. The NVIDIA CUDA software is installed in '/soft/cuda-VERSION' where VERSION is the version number of CUDA. Note: * At the moment, there is no provision for preventing contention between users of the Tesla unit. That is, if more than one user runs a job that tries to use it at one time, there may be contention and we don't know what effect this will have. * Equally, there is no provision for preventing contention on the host side -- if you only ask for a few CPU cores and another CPU-bound job takes the remainder, this may (or may not) interfere with your job. The easiest way to overcome any issues of contention would be to ask for all cores of the host machine, even if you're not going to use them all: then the job submission system will not allow any other jobs to run on the host, so you have exclusive access to the GPU. It may be sensible to bind your host-side process to cores in the socket(s) physically connected to the PCI bus, using the Linux process affinity commands. This will depend on your application and is left up to users at the moment. The current arrangement in principle allows non-GPU jobs running on a machine with a GPU to block CUDA jobs. If this becomes a problem, we will review the queueing arrangements. == Via OpenGL context == It is possible for an application to access an OpenGL context on the GPUs using the provided 'headless' X server configuration. * The user needs to start an X server: <pre> X :42 & </pre> where "42" is a free display number. If a number is chosen that is not free, an error message will appear. * Set the DISPLAY environment variable: <pre> export DISPLAY=:42.0 </pre> where "42" is the free display number used for starting the X server (this assumes the syntax of the bash shell), and .0 denotes the first screen (there are currently four, but using more than one is untested). * Start the application, which will need to request an OpenGL context, make use of it, output results unattended, and quit after it is done (a render job, for example). Submission of such jobs through the job queue is also untested, so needs to be discussed with the administrators first. a687cf141aca053139a460abe5ba8263d64ab777 WEAVE 0 72 519 2018-03-15T16:25:17Z Mjh 2 Created page with "Access to the UH HPC facility is available to members of the WEAVE consortium under an agreement between WEAVE and the University of Hertfordshire. == Obtaining access == If..." wikitext text/x-wiki Access to the UH HPC facility is available to members of the WEAVE consortium under an agreement between WEAVE and the University of Hertfordshire. == Obtaining access == If you are a member of the WEAVE consortium you may request access by sending e-mail to Martin Hardcastle (<tt>m.j.hardcastle@herts.ac.uk</tt>). == Terms of access == The access provided is on the basis that you will be using the facility only for work directly related to WEAVE activities. It does not allow you to use the facility as a general HPC facility for other purposes. General access is only available through a UH-based collaborator. == Types of usage == By default you will be using the facility on the same terms as all other users. That means you can log on and submit jobs as described on the [[Main Page]] which will be processed as resources allow.
If you have a particular task that you need to do where you know in advance that you will need a lot of CPU time then you can ask for a [[reservations|reservation]] to be made for you or for the WEAVE group in general. So, for example, you could ask for two 32-core nodes to be reserved for a month and you would then have access to those nodes through the [[jobs|job control system]]. Reservations are time-limited and the start and end period of the reservation need to be agreed in advance. == Running configure == [Dan to add] 8873efadaf96a5191c7cae21e07becd1eb027ccd 520 519 2018-03-15T16:26:50Z Mjh 2 /* Types of usage */ wikitext text/x-wiki Access to the UH HPC facility is available to members of the WEAVE consortium under an agreement between WEAVE and the University of Hertfordshire. == Obtaining access == If you are a member of the WEAVE consortium you may request access by sending e-mail to Martin Hardcastle (<tt>m.j.hardcastle@herts.ac.uk</tt>). == Terms of access == The access provided is on the basis that you will be using the facility only for work directly related to WEAVE activities. It does not allow you to use the facility as a general HPC facility for other purposes. General access is only available through a UH-based collaborator. == Types of usage == By default you will be using the facility on the same terms as all other users. That means you can log on and submit jobs as described on the [[Main Page]] which will be processed as resources allow. Depending on your needs, you may want to use [[jobs|batch jobs]] or [[interactive jobs]]. If you have a particular task that you need to do where you know in advance that you will need a lot of CPU time then you can ask for a [[reservations|reservation]] to be made for you or for the WEAVE group in general. So, for example, you could ask for two 32-core nodes to be reserved for a month and you would then have access to those nodes through the [[jobs|job control system]]. Reservations are time-limited and the start and end period of the reservation need to be agreed in advance. == Running configure == [Dan to add] 1cd281b36b3d7e0823bcbddad38bebdce9540021 AIPS 0 27 521 267 2018-03-15T17:05:23Z Mjh 2 wikitext text/x-wiki AIPS is software for radio astronomy data reduction. It is installed on the cluster in /soft/aips . To use aips you will need to be in the aipsuser group. From the head node, use an [[interactive jobs|interactive job]] to get a session on the machine you want to use. You can use any compute node or the [[SMP machines]]. Be sure to use the -X option to get X11 forwarding. Then do <tt>/soft/aips/START_AIPS tv=local da=STRI_CLUSTER tpok</tt>. You may choose your own AIPS number; if you clash with someone else, you'll probably notice. == Parseltongue == To use Parseltongue for scripting AIPS do <pre>setenv PYTHONPATH /soft/Obit/python source /soft/AIPS/LOGIN.CSH /soft/parseltongue/bin/ParselTongue </pre> bae77eab55d1f0d3ed6b1d948fe253c9f59f903c User:Dsmith 2 73 522 2018-03-16T13:37:23Z Mjh 2 Creating user page for new user. wikitext text/x-wiki My name is Dan Smith and I am an astronomer. I own four combs of different sizes, and never use any of them. Is that fifty words yet? 1+1 = 2, 6+2 = 8. My name is Dan Smith and I am an astronomer. I own four combs of different sizes, and never use any of them. Is that fifty words yet? 1+1 = 2, 6+2 = 8. e8cf40d0f60667db9a862dd5123a129c3aa73b73 User talk:Dsmith 3 74 523 2018-03-16T13:37:24Z Mjh 2 Welcome! 
wikitext text/x-wiki '''Welcome to ''Clusterwiki''!''' We hope you will contribute much and well. You will probably want to read the [https://www.mediawiki.org/wiki/Special:MyLanguage/Help:Contents help pages]. Again, welcome and have fun! [[User:Mjh|Mjh]] ([[User talk:Mjh|talk]]) 13:37, 16 March 2018 (UTC) d9421985af77a839f9de6afe0ee93fdb9763aae2 User:Ptaylor 2 75 524 2018-03-16T13:37:48Z Mjh 2 Creating user page for new user. wikitext text/x-wiki Based at ANU in Canberra working with Christoph Federrath, previously at UH working with Chiaki Kobayashi. Understanding galaxy evolution using cosmological simulations, with a focus on the influence of AGN feedback. I am also constantly working to improve numerical modelling of AGN feedback by incorporating results from small-scale simulations of BH-driven jets. 3b38a1ec75946c3c4c664aabd3df15eb178d8f41 User talk:Ptaylor 3 76 525 2018-03-16T13:37:49Z Mjh 2 Welcome! wikitext text/x-wiki '''Welcome to ''Clusterwiki''!''' We hope you will contribute much and well. You will probably want to read the [https://www.mediawiki.org/wiki/Special:MyLanguage/Help:Contents help pages]. Again, welcome and have fun! [[User:Mjh|Mjh]] ([[User talk:Mjh|talk]]) 13:37, 16 March 2018 (UTC) d9421985af77a839f9de6afe0ee93fdb9763aae2 WEAVE 0 72 526 520 2018-03-16T14:02:10Z Dsmith 15 wikitext text/x-wiki Access to the UH HPC facility is available to members of the WEAVE Survey Working Group (SWG) under an agreement between WEAVE and the University of Hertfordshire. == Obtaining access == If you are a member of the WEAVE consortium you may request access by sending e-mail to Martin Hardcastle (<tt>m.j.hardcastle@herts.ac.uk</tt>). == Terms of access == The access provided is on the basis that you will be using the facility only for work directly related to WEAVE SWG activities. It does not allow you to use the facility as a general HPC facility for other purposes. General access is only available through a UH-based collaborator. == Types of usage == By default you will be using the facility on the same terms as all other users. That means you can log on and submit jobs as described on the [[Main Page]] which will be processed as resources allow. Depending on your needs, you may want to use [[jobs|batch jobs]] or [[interactive jobs]]. If you have a particular task that you need to do where you know in advance that you will need a lot of CPU time (e.g. in the build up to a deadline for OpR3b) then you can ask for a [[reservations|reservation]] to be made for you or for the WEAVE group in general. So, for example, you could ask for two 32-core nodes to be reserved for a month and you would then have access to those nodes through the [[jobs|job control system]]. Reservations are time-limited and the start and end period of the reservation need to be agreed in advance. == Running configure == The latest version of configure is installed at PATH TBD. Do not run configure on the login nodes. A basic script to run configure using 8 cores on a single compute node, for a standard field containing 2,000 targets might look something like this: How to put an example .csh file here? For details of how to submit a job, look at the [[jobs]] page. 66adac3add8f3e98b77fa7c7730b645434050f70 527 526 2018-03-16T14:27:57Z Dsmith 15 wikitext text/x-wiki Access to the UH HPC facility is available to members of the WEAVE Survey Working Group (SWG) under an agreement between WEAVE and the University of Hertfordshire. 
== Obtaining access == If you are a member of the WEAVE SWG you may request access by sending e-mail to Martin Hardcastle (<tt>m.j.hardcastle@herts.ac.uk</tt>). == Terms of access == The access provided is on the basis that you will be using the facility only for work directly related to WEAVE SWG activities. It does not allow you to use the facility as a general HPC facility for other purposes. General access is only available through a UH-based collaborator. == Types of usage == By default you will be using the facility on the same terms as all other users. That means you can log on and submit jobs as described on the [[Main Page]] which will be processed as resources allow. Depending on your needs, you may want to use [[jobs|batch jobs]] or [[interactive jobs]]. If you have a particular task that you need to do where you know in advance that you will need a lot of CPU time (e.g. in the build up to a deadline for OpR3b) then you can ask for a [[reservations|reservation]] to be made for you or for the WEAVE group in general. So, for example, you could ask for two 32-core nodes to be reserved for a month and you would then have access to those nodes through the [[jobs|job control system]]. Reservations are time-limited and the start and end period of the reservation need to be agreed in advance. == Running configure == The latest version of configure is installed at PATH TBD. Do not run configure on the login nodes. A basic script to run configure using 8 cores on a single compute node, for a standard field containing 2,000 targets might look something like this: <pre> #!/bin/csh #PBS -N configure_example #PBS -q main #PBS -l nodes=1:ppn=8 #PBS -l walltime=00:30:00 cd /path/to/your/xml/input/file configure --gui 0 --field test_field.xml --output test_field_configured.xml END </pre> For details of how to submit a job, and what the other commands in this example mean, take a look at the [[jobs]] page. fd32c1d8c5d452d3292f881acee92726ba8f39b5 528 527 2018-03-16T17:28:14Z Dsmith 15 /* Running configure */ wikitext text/x-wiki Access to the UH HPC facility is available to members of the WEAVE Survey Working Group (SWG) under an agreement between WEAVE and the University of Hertfordshire. == Obtaining access == If you are a member of the WEAVE SWG you may request access by sending e-mail to Martin Hardcastle (<tt>m.j.hardcastle@herts.ac.uk</tt>). == Terms of access == The access provided is on the basis that you will be using the facility only for work directly related to WEAVE SWG activities. It does not allow you to use the facility as a general HPC facility for other purposes. General access is only available through a UH-based collaborator. == Types of usage == By default you will be using the facility on the same terms as all other users. That means you can log on and submit jobs as described on the [[Main Page]] which will be processed as resources allow. Depending on your needs, you may want to use [[jobs|batch jobs]] or [[interactive jobs]]. If you have a particular task that you need to do where you know in advance that you will need a lot of CPU time (e.g. in the build up to a deadline for OpR3b) then you can ask for a [[reservations|reservation]] to be made for you or for the WEAVE group in general. So, for example, you could ask for two 32-core nodes to be reserved for a month and you would then have access to those nodes through the [[jobs|job control system]]. Reservations are time-limited and the start and end period of the reservation need to be agreed in advance. 
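As a sketch of how a reservation might then be used (the group name <tt>weave</tt> below is purely illustrative; the [[administrators]] will tell you the actual name associated with your reservation), the same <tt>-W group_list</tt> mechanism used elsewhere on the cluster can be given to <tt>qsub</tt> on the command line or as a directive in the script:

<pre>
# Illustrative only: submit a job against a group reservation
qsub -W group_list=weave myjob.csh
</pre>

or, equivalently, add <tt>#PBS -W group_list=weave</tt> to the job script itself.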
== Running configure == The latest version of configure is installed at PATH TBD. Do not run configure on the login nodes. A basic script to run configure using 8 cores on a single compute node, with 8GB of memory for a standard field containing 2,000 targets might look something like this: <pre> #!/bin/csh #PBS -N configure_example #PBS -q main #PBS -l nodes=1:ppn=8 #PBS -l walltime=00:30:00 #PBS -l pmem=8gb cd /path/to/your/xml/input/file configure --gui 0 --field test_field.xml --output test_field_configured.xml END </pre> To help you estimate the resources that you're likely to need, and how this might vary dependent on the degree of clustering in your target data, take a look at [https://star.herts.ac.uk/~dsmith/configure_resources_used.jpg?dl=0 these plots] For details of how to submit a job, and what the other commands in this example mean, take a look at the [[jobs]] page. 19563af7f040fd0ab87b5c927edc78259303053c 529 528 2018-03-16T21:17:37Z Mjh 2 wikitext text/x-wiki Access to the UH HPC facility is available to members of the WEAVE Survey Working Group (SWG) under an agreement between WEAVE and the University of Hertfordshire. == Obtaining access == If you are a member of the WEAVE SWG you may request access by sending e-mail to Martin Hardcastle (<tt>m.j.hardcastle@herts.ac.uk</tt>). Please specify the username you would like to use (which will be allocated to you if not already in use). == Terms of access == The access provided is on the basis that you will be using the facility only for work directly related to WEAVE SWG activities. It does not allow you to use the facility as a general HPC facility for other purposes. General access is only available through a UH-based collaborator. == Types of usage == By default you will be using the facility on the same terms as all other users. That means you can log on and submit jobs as described on the [[Main Page]] which will be processed as resources allow. Depending on your needs, you may want to use [[jobs|batch jobs]] or [[interactive jobs]]. If you have a particular task that you need to do where you know in advance that you will need a lot of CPU time (e.g. in the build up to a deadline for OpR3b) then you can ask for a [[reservations|reservation]] to be made for you or for the WEAVE group in general. So, for example, you could ask for two 32-core nodes to be reserved for a month and you would then have access to those nodes through the [[jobs|job control system]]. Reservations are time-limited and the start and end period of the reservation need to be agreed in advance. == Running configure == The latest version of configure is installed at PATH TBD. Do not run configure on the login nodes. A basic script to run configure using 8 cores on a single compute node, with 8GB of memory for a standard field containing 2,000 targets might look something like this: <pre> #!/bin/csh #PBS -N configure_example #PBS -q main #PBS -l nodes=1:ppn=8 #PBS -l walltime=00:30:00 #PBS -l pmem=8gb cd /path/to/your/xml/input/file configure --gui 0 --field test_field.xml --output test_field_configured.xml </pre> To help you estimate the resources that you're likely to need, and how this might vary dependent on the degree of clustering in your target data, take a look at [https://star.herts.ac.uk/~dsmith/configure_resources_used.jpg?dl=0 these plots] For details of how to submit a job, and what the other commands in this example mean, take a look at the [[jobs]] page. 
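To make the submission step concrete (assuming, purely for illustration, that the script above has been saved as <tt>configure_example.csh</tt>), the standard Torque commands are used:

<pre>
# Submit the script to the queue; this prints a job ID
qsub configure_example.csh
# Check the state of your queued and running jobs
qstat -u your_username
</pre>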
2fa87fb8906b0fc12c8d67bb982813ac7b7cbaf4 530 529 2018-03-17T12:17:19Z Dsmith 15 /* Running configure */ wikitext text/x-wiki Access to the UH HPC facility is available to members of the WEAVE Survey Working Group (SWG) under an agreement between WEAVE and the University of Hertfordshire. == Obtaining access == If you are a member of the WEAVE SWG you may request access by sending e-mail to Martin Hardcastle (<tt>m.j.hardcastle@herts.ac.uk</tt>). Please specify the username you would like to use (which will be allocated to you if not already in use). == Terms of access == The access provided is on the basis that you will be using the facility only for work directly related to WEAVE SWG activities. It does not allow you to use the facility as a general HPC facility for other purposes. General access is only available through a UH-based collaborator. == Types of usage == By default you will be using the facility on the same terms as all other users. That means you can log on and submit jobs as described on the [[Main Page]] which will be processed as resources allow. Depending on your needs, you may want to use [[jobs|batch jobs]] or [[interactive jobs]]. If you have a particular task that you need to do where you know in advance that you will need a lot of CPU time (e.g. in the build up to a deadline for OpR3b) then you can ask for a [[reservations|reservation]] to be made for you or for the WEAVE group in general. So, for example, you could ask for two 32-core nodes to be reserved for a month and you would then have access to those nodes through the [[jobs|job control system]]. Reservations are time-limited and the start and end period of the reservation need to be agreed in advance. == Running configure == The latest version of configure is installed at PATH TBD. Do not run configure on the login nodes. A basic script to run configure using 8 cores on a single compute node, with 8GB of memory for a standard field containing 2,000 targets might look something like this: <pre> #!/bin/csh #PBS -N configure_example #PBS -q main #PBS -l nodes=1:ppn=8 #PBS -l walltime=00:30:00 #PBS -l pmem=8gb cd /path/to/your/xml/input/file configure --gui 0 --threads 8 --field test_field.xml --output test_field_configured.xml </pre> To help you estimate the resources that you're likely to need, and how this might vary dependent on the degree of clustering in your target data, take a look at [https://star.herts.ac.uk/~dsmith/configure_resources_used.jpg?dl=0 these plots] For details of how to submit a job, and what the other commands in this example mean, take a look at the [[jobs]] page. 4b523432a748ca3188a9addf1a3cce652d123e0d 531 530 2018-03-21T11:46:19Z Dsmith 15 wikitext text/x-wiki Access to the UH HPC facility is available to members of the WEAVE Survey Working Group (SWG) under an agreement between WEAVE and the University of Hertfordshire. == Obtaining access == If you are a member of the WEAVE SWG you may request access by sending an e-mail to Martin Hardcastle (<tt>m.j.hardcastle@herts.ac.uk</tt>). Please specify the username you would like to use (which will be allocated to you if not already in use). == Terms of access == The access provided is on the basis that you will be using the facility only for work directly related to WEAVE SWG activities. It does not allow you to use the facility as a general HPC facility for other purposes. General access is only available through a UH-based collaborator. == Types of usage == By default you will be using the facility on the same terms as all other users. 
That means you can log on and submit jobs as described on the [[Main Page]] which will be processed as resources allow. Depending on your needs, you may want to use [[jobs|batch jobs]] or [[interactive jobs]]. If you have a particular task that you need to do where you know in advance that you will need a lot of CPU time (e.g. in the build up to a deadline for OpR3b) then you can ask for a [[reservations|reservation]] to be made for you or for the WEAVE group in general. So, for example, you could ask for two 32-core nodes to be reserved for a month and you would then have access to those nodes through the [[jobs|job control system]]. Reservations are time-limited and the start and end period of the reservation need to be agreed in advance. == Running configure == The latest version of configure is installed at PATH TBD. Do not run configure on the login nodes. A basic script to run configure using 8 cores on a single compute node, with 8GB of memory for a standard field containing 2,000 targets might look something like this: <pre> #!/bin/csh #PBS -N configure_example #PBS -q main #PBS -l nodes=1:ppn=8 #PBS -l walltime=00:30:00 #PBS -l pmem=8gb cd /path/to/your/xml/input/file configure --gui 0 --threads 8 --field test_field.xml --output test_field_configured.xml </pre> To help you estimate the resources that you're likely to need, and how this might vary depending on the degree of clustering in your target data, take a look at [https://star.herts.ac.uk/~dsmith/configure_resources_used.jpg?dl=0 these plots] For details of how to submit a job, and what the other commands in this example mean, take a look at the [[jobs]] page. 5b6cc323031b152f8aa36b6194a3603e03798ba3 533 531 2018-03-22T12:23:21Z Dsmith 15 /* Running configure */ wikitext text/x-wiki Access to the UH HPC facility is available to members of the WEAVE Survey Working Group (SWG) under an agreement between WEAVE and the University of Hertfordshire. == Obtaining access == If you are a member of the WEAVE SWG you may request access by sending an e-mail to Martin Hardcastle (<tt>m.j.hardcastle@herts.ac.uk</tt>). Please specify the username you would like to use (which will be allocated to you if not already in use). == Terms of access == The access provided is on the basis that you will be using the facility only for work directly related to WEAVE SWG activities. It does not allow you to use the facility as a general HPC facility for other purposes. General access is only available through a UH-based collaborator. == Types of usage == By default you will be using the facility on the same terms as all other users. That means you can log on and submit jobs as described on the [[Main Page]] which will be processed as resources allow. Depending on your needs, you may want to use [[jobs|batch jobs]] or [[interactive jobs]]. If you have a particular task that you need to do where you know in advance that you will need a lot of CPU time (e.g. in the build up to a deadline for OpR3b) then you can ask for a [[reservations|reservation]] to be made for you or for the WEAVE group in general. So, for example, you could ask for two 32-core nodes to be reserved for a month and you would then have access to those nodes through the [[jobs|job control system]]. Reservations are time-limited and the start and end period of the reservation need to be agreed in advance. == Running configure == The latest version of configure is installed at /soft/configure/configure. Do not run configure on the login nodes. 
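If you want to experiment with configure interactively, one option (a sketch; see the [[interactive jobs]] page for the locally recommended form) is to request an interactive session on a compute node and run it there rather than on a login node:

<pre>
# Request an interactive session on one compute node with 8 cores for 2 hours
qsub -I -q main -l nodes=1:ppn=8 -l walltime=02:00:00
# ...then run the configure command (as in the script below) on the compute node
</pre>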
A basic script to run configure using 8 cores on a single compute node, with 8GB of memory for a standard field containing 2,000 targets might look something like this: <pre> #!/bin/csh #PBS -N configure_example #PBS -q main #PBS -l nodes=1:ppn=8 #PBS -l walltime=00:30:00 #PBS -l pmem=8gb cd /path/to/your/xml/input/file /soft/configure/configure --gui 0 --threads 8 --field test_field.xml --output test_field_configured.xml </pre> To help you estimate the resources that you're likely to need, and how this might vary depending on the degree of clustering in your target data, take a look at [https://star.herts.ac.uk/~dsmith/configure_resources_used.jpg?dl=0 these plots] For details of how to submit a job, and what the other commands in this example mean, take a look at the [[jobs]] page. 847fd5493dd3e810f1aa86d6f1636ad59d80f35a 534 533 2018-03-22T15:25:24Z Dsmith 15 /* Running configure */ wikitext text/x-wiki Access to the UH HPC facility is available to members of the WEAVE Survey Working Group (SWG) under an agreement between WEAVE and the University of Hertfordshire. == Obtaining access == If you are a member of the WEAVE SWG you may request access by sending an e-mail to Martin Hardcastle (<tt>m.j.hardcastle@herts.ac.uk</tt>). Please specify the username you would like to use (which will be allocated to you if not already in use). == Terms of access == The access provided is on the basis that you will be using the facility only for work directly related to WEAVE SWG activities. It does not allow you to use the facility as a general HPC facility for other purposes. General access is only available through a UH-based collaborator. == Types of usage == By default you will be using the facility on the same terms as all other users. That means you can log on and submit jobs as described on the [[Main Page]] which will be processed as resources allow. Depending on your needs, you may want to use [[jobs|batch jobs]] or [[interactive jobs]]. If you have a particular task that you need to do where you know in advance that you will need a lot of CPU time (e.g. in the build up to a deadline for OpR3b) then you can ask for a [[reservations|reservation]] to be made for you or for the WEAVE group in general. So, for example, you could ask for two 32-core nodes to be reserved for a month and you would then have access to those nodes through the [[jobs|job control system]]. Reservations are time-limited and the start and end period of the reservation need to be agreed in advance. == Running configure == The latest version of configure is installed at /soft/configure/configure. Do not run configure on the login nodes. A basic script to run configure using 8 cores on a single compute node, with 8GB of memory (using the pmem command, which specifies the memory allocated per core) for a standard field containing 2,000 targets might look something like this: <pre> #!/bin/csh #PBS -N configure_example #PBS -q main #PBS -l nodes=1:ppn=8 #PBS -l walltime=00:30:00 #PBS -l pmem=1gb cd /path/to/your/xml/input/file /soft/configure/configure --gui 0 --threads 8 --field test_field.xml --output test_field_configured.xml </pre> To help you estimate the resources that you're likely to need, and how this might vary depending on the degree of clustering in your target data, take a look at [https://star.herts.ac.uk/~dsmith/configure_resources_used.jpg?dl=0 these plots] For details of how to submit a job, and what the other commands in this example mean, take a look at the [[jobs]] page. 
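To spell out the arithmetic: <tt>pmem</tt> is multiplied by the number of cores requested, so the script above (<tt>ppn=8</tt> with <tt>pmem=1gb</tt>) is allocated 8 x 1 GB = 8 GB in total, whereas <tt>pmem=8gb</tt> with 8 cores would request 64 GB and might never be scheduled. For example (illustrative values), a field needing roughly 16 GB in total could request:

<pre>
#PBS -l nodes=1:ppn=8
#PBS -l pmem=2gb
</pre>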
429b4585a7f312c8ed5911d7dee7d6d368fd738c 556 534 2018-06-20T19:48:28Z Mjh 2 /* Obtaining access */ wikitext text/x-wiki Access to the UH HPC facility is available to members of the WEAVE Survey Working Group (SWG) under an agreement between WEAVE and the University of Hertfordshire. == Obtaining access == If you are a member of the WEAVE SWG you may request access by sending an e-mail to Martin Hardcastle (<tt>m.j.hardcastle@herts.ac.uk</tt>). Please specify the username you would like to use (which will be allocated to you if not already in use). It will speed the process up if you could also confirm in the e-mail that you accept the [[Terms of use]] of the facility. == Terms of access == The access provided is on the basis that you will be using the facility only for work directly related to WEAVE SWG activities. It does not allow you to use the facility as a general HPC facility for other purposes. General access is only available through a UH-based collaborator. == Types of usage == By default you will be using the facility on the same terms as all other users. That means you can log on and submit jobs as described on the [[Main Page]] which will be processed as resources allow. Depending on your needs, you may want to use [[jobs|batch jobs]] or [[interactive jobs]]. If you have a particular task that you need to do where you know in advance that you will need a lot of CPU time (e.g. in the build up to a deadline for OpR3b) then you can ask for a [[reservations|reservation]] to be made for you or for the WEAVE group in general. So, for example, you could ask for two 32-core nodes to be reserved for a month and you would then have access to those nodes through the [[jobs|job control system]]. Reservations are time-limited and the start and end period of the reservation need to be agreed in advance. == Running configure == The latest version of configure is installed at /soft/configure/configure. Do not run configure on the login nodes. A basic script to run configure using 8 cores on a single compute node, with 8GB of memory (using the pmem command, which specifies the memory allocated per core) for a standard field containing 2,000 targets might look something like this: <pre> #!/bin/csh #PBS -N configure_example #PBS -q main #PBS -l nodes=1:ppn=8 #PBS -l walltime=00:30:00 #PBS -l pmem=1gb cd /path/to/your/xml/input/file /soft/configure/configure --gui 0 --threads 8 --field test_field.xml --output test_field_configured.xml </pre> To help you estimate the resources that you're likely to need, and how this might vary depending on the degree of clustering in your target data, take a look at [https://star.herts.ac.uk/~dsmith/configure_resources_used.jpg?dl=0 these plots] For details of how to submit a job, and what the other commands in this example mean, take a look at the [[jobs]] page. 4c00c86e96bed9ccc2212dafcdabb234edb91976 AIPS 0 27 532 521 2018-03-22T11:50:57Z Mjh 2 wikitext text/x-wiki AIPS is software for radio astronomy data reduction. It is installed on the cluster in /soft/aips . To use aips you will need to be in the aipsuser group. From the head node, use an [[interactive jobs|interactive job]] to get a session on the machine you want to use. You can use any compute node or the [[SMP machines]]. Be sure to use the -X option to get X11 forwarding. Then do <tt>/soft/aips/START_AIPS tv=local da=STRI_CLUSTER tpok</tt>. You may choose your own AIPS number; if you clash with someone else, you'll probably notice. 
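Putting those steps together, a sketch (the exact interactive-job options to use locally are described on the [[interactive jobs]] page; <tt>-I</tt> requests an interactive session and <tt>-X</tt> enables X11 forwarding):

<pre>
# From the head node: start an interactive session on a compute node with X11 forwarding
qsub -I -X -q main -l nodes=1:ppn=1 -l walltime=04:00:00
# Then, on the compute node:
/soft/aips/START_AIPS tv=local da=STRI_CLUSTER tpok
</pre>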
== Parseltongue == To use Parseltongue for scripting AIPS do <pre>setenv PYTHONPATH /soft/Obit/python source /soft/aips/LOGIN.CSH /soft/parseltongue/bin/ParselTongue </pre> 844db4180b6832567975fc6a227bc02e8ba0030b Storage 0 8 535 515 2018-03-29T13:54:02Z Mjh 2 wikitext text/x-wiki The '''storage''' available on the cluster comes in three flavours. Please use all cluster storage space responsibly: when you no longer need it, delete the contents or transfer them elsewhere so that others can use the space. Large datasets will be being processed on the cluster. (See also [[policies]].) == System-wide NFS storage == This is shared across all machines and served by single servers in each case. It can be slow if many users are trying to access it simultaneously. We are phasing out the use of this kind of storage for large volumes. Currently general-user NFS volumes are: * 1 Tb of user home directories, mounted as /home * Software directory /soft * 58 Tb of scratch for CAIR users only, mounted as /cair-scratch * 334 Tb of scratch for CAR users only, mounted as /car-data * 77 Tb of scratch for CAIR users only, mounted as /cair-work In addition there are some special NFS volumes, which are mounted under /data and are only available to particular users or groups: * /data/lofar : for LOFAR-UK users * /data/jim : Jim Geach * /data/ralf : Ralf Napiwotzki * /data/astroml : Machine learning group The /home disc is for files that are important and require long-term storage (source code, qsub scripts, etc). It should not be used for general data or for large quantities of output from jobs, for example. Large files should be stored on the relevant data disk (if you want to work relative to /home, use a symbolic link). A [[quota]] system is in place on /home. == Distributed file system == The cluster uses the [https://www.beegfs.io/content/ Beegfs] file system for distributed storage, mounted at /beegfs . Files stored here are distributed over a number of servers in a way that's transparent to the user, who sees one big file system. Because there are multiple servers, /beegfs scales much better under load than the NFS storage. There is 560 TB of beegfs storage nominally distributed as follows: * 180 TB: general use, under /beegfs/general * 146 TB: CAR, under /beegfs/car * 144 TB: LOFAR-UK, under /beegfs/lofar * 90 TB: CAIR, under /beegfs/cair Use of the relevant subdirectories means that you think you have permission to use those allocations. In practice there is no difference in where or how the data are stored. == Disks local to machines == The four [[SMP machines]] have their own local scratch discs: see the relevant page for more information. These are mounted system-wide as /smp1 .. /smp4 Nodes also have a small amount of [[local disk space]] accessible to jobs (but not mounted on the head node), and [[ramdisks]] are also available. == Backups == No data area on the cluster is currently backed up. You must take responsibility for your own backups. /home is backed up by nightly rsync to another location on the cluster: that means if you delete a crucial file we ''may'' have a useful backup (ask straight away). fd54262b7d2364d2d5069bc3a065893e250c3a18 541 535 2018-04-30T11:26:41Z Mjh 2 /* System-wide NFS storage */ wikitext text/x-wiki The '''storage''' available on the cluster comes in three flavours. Please use all cluster storage space responsibly: when you no longer need it, delete the contents or transfer them elsewhere so that others can use the space. 
Large datasets will be being processed on the cluster. (See also [[policies]].) == System-wide NFS storage == This is shared across all machines and served by single servers in each case. It can be slow if many users are trying to access it simultaneously. We are phasing out the use of this kind of storage for large volumes. Currently general-user NFS volumes are: * 3 Tb of user home directories, mounted as /home * Software directory /soft * 58 Tb of scratch for CAIR users only, mounted as /cair-scratch * 334 Tb of scratch for CAR users only, mounted as /car-data * 77 Tb of scratch for CAIR users only, mounted as /cair-work In addition there are some special NFS volumes, which are mounted under /data and are only available to particular users or groups: * /data/lofar : for LOFAR-UK users * /data/jim : Jim Geach * /data/ralf : Ralf Napiwotzki * /data/astroml : Machine learning group The /home disc is for files that are important and require long-term storage (source code, qsub scripts, etc). It should not be used for general data or for large quantities of output from jobs, for example. Large files should be stored on the relevant data disk (if you want to work relative to /home, use a symbolic link). A [[quota]] system is in place on /home. == Distributed file system == The cluster uses the [https://www.beegfs.io/content/ Beegfs] file system for distributed storage, mounted at /beegfs . Files stored here are distributed over a number of servers in a way that's transparent to the user, who sees one big file system. Because there are multiple servers, /beegfs scales much better under load than the NFS storage. There is 560 TB of beegfs storage nominally distributed as follows: * 180 TB: general use, under /beegfs/general * 146 TB: CAR, under /beegfs/car * 144 TB: LOFAR-UK, under /beegfs/lofar * 90 TB: CAIR, under /beegfs/cair Use of the relevant subdirectories means that you think you have permission to use those allocations. In practice there is no difference in where or how the data are stored. == Disks local to machines == The four [[SMP machines]] have their own local scratch discs: see the relevant page for more information. These are mounted system-wide as /smp1 .. /smp4 Nodes also have a small amount of [[local disk space]] accessible to jobs (but not mounted on the head node), and [[ramdisks]] are also available. == Backups == No data area on the cluster is currently backed up. You must take responsibility for your own backups. /home is backed up by nightly rsync to another location on the cluster: that means if you delete a crucial file we ''may'' have a useful backup (ask straight away). 00c610a102b7693cfa625901bb11580f1e7404f6 542 541 2018-04-30T11:29:29Z Mjh 2 wikitext text/x-wiki The '''storage''' available on the cluster comes in three flavours. Please use all cluster storage space responsibly: when you no longer need it, delete the contents or transfer them elsewhere so that others can use the space. Large datasets will be being processed on the cluster. (See also [[policies]].) == Overview == Most cluster users will use /home for small amounts of important data (no more than 50 GB) and /beegfs/general for large amounts of data. Your home directory is created automatically for you, but you will need to make a directory /beegfs/general/your_username to work on the /beegfs volume. For details of all the different areas available, read more below. == System-wide NFS storage == This is shared across all machines and served by single servers in each case. 
It can be slow if many users are trying to access it simultaneously. We are phasing out the use of this kind of storage for large volumes. Currently general-user NFS volumes are: * 3 Tb of user home directories, mounted as /home * Software directory /soft * 58 Tb of scratch for CAIR users only, mounted as /cair-scratch * 334 Tb of scratch for CAR users only, mounted as /car-data * 77 Tb of scratch for CAIR users only, mounted as /cair-work In addition there are some special NFS volumes, which are mounted under /data and are only available to particular users or groups: * /data/lofar : for LOFAR-UK users * /data/jim : Jim Geach * /data/ralf : Ralf Napiwotzki * /data/astroml : Machine learning group The /home disc is for files that are important and require long-term storage (source code, qsub scripts, etc). It should not be used for general data or for large quantities of output from jobs, for example. Large files should be stored on the relevant data disk (if you want to work relative to /home, use a symbolic link). A [[quota]] system is in place on /home. == Distributed file system == The cluster uses the [https://www.beegfs.io/content/ Beegfs] file system for distributed storage, mounted at /beegfs . Files stored here are distributed over a number of servers in a way that's transparent to the user, who sees one big file system. Because there are multiple servers, /beegfs scales much better under load than the NFS storage. There is 650 TB of beegfs storage nominally distributed as follows: * 270 TB: general use, under /beegfs/general * 146 TB: CAR, under /beegfs/car * 144 TB: LOFAR-UK, under /beegfs/lofar * 90 TB: CAIR, under /beegfs/cair Use of the relevant subdirectories means that you think you have permission to use those allocations. In practice there is no difference in where or how the data are stored. == Disks local to machines == The four [[SMP machines]] have their own local scratch discs: see the relevant page for more information. These are mounted system-wide as /smp1 .. /smp4 Nodes also have a small amount of [[local disk space]] accessible to jobs (but not mounted on the head node), and [[ramdisks]] are also available. == Backups == No data area on the cluster is currently backed up. You must take responsibility for your own backups. /home is backed up by nightly rsync to another location on the cluster: that means if you delete a crucial file we ''may'' have a useful backup (ask straight away). 1f99d48ac255f71ad460b7a1e316b80d031a7a47 554 542 2018-05-11T08:09:31Z Mjh 2 /* Distributed file system */ wikitext text/x-wiki The '''storage''' available on the cluster comes in three flavours. Please use all cluster storage space responsibly: when you no longer need it, delete the contents or transfer them elsewhere so that others can use the space. Large datasets will be being processed on the cluster. (See also [[policies]].) == Overview == Most cluster users will use /home for small amounts of important data (no more than 50 GB) and /beegfs/general for large amounts of data. Your home directory is created automatically for you, but you will need to make a directory /beegfs/general/your_username to work on the /beegfs volume. For details of all the different areas available, read more below. == System-wide NFS storage == This is shared across all machines and served by single servers in each case. It can be slow if many users are trying to access it simultaneously. We are phasing out the use of this kind of storage for large volumes. 
Currently general-user NFS volumes are: * 3 Tb of user home directories, mounted as /home * Software directory /soft * 58 Tb of scratch for CAIR users only, mounted as /cair-scratch * 334 Tb of scratch for CAR users only, mounted as /car-data * 77 Tb of scratch for CAIR users only, mounted as /cair-work In addition there are some special NFS volumes, which are mounted under /data and are only available to particular users or groups: * /data/lofar : for LOFAR-UK users * /data/jim : Jim Geach * /data/ralf : Ralf Napiwotzki * /data/astroml : Machine learning group The /home disc is for files that are important and require long-term storage (source code, qsub scripts, etc). It should not be used for general data or for large quantities of output from jobs, for example. Large files should be stored on the relevant data disk (if you want to work relative to /home, use a symbolic link). A [[quota]] system is in place on /home. == Distributed file system == The cluster uses the [https://www.beegfs.io/content/ Beegfs] file system for distributed storage, mounted at /beegfs . Files stored here are distributed over a number of servers in a way that's transparent to the user, who sees one big file system. Because there are multiple servers, /beegfs scales much better under load than the NFS storage. There is 740 TB of beegfs storage nominally distributed as follows: * 360 TB: general use, under /beegfs/general * 145 TB: CAR, under /beegfs/car * 145 TB: LOFAR-UK, under /beegfs/lofar * 90 TB: CAIR, under /beegfs/cair Use of the relevant subdirectories means that you think you have permission to use those allocations. In practice there is no difference in where or how the data are stored. == Disks local to machines == The four [[SMP machines]] have their own local scratch discs: see the relevant page for more information. These are mounted system-wide as /smp1 .. /smp4 Nodes also have a small amount of [[local disk space]] accessible to jobs (but not mounted on the head node), and [[ramdisks]] are also available. == Backups == No data area on the cluster is currently backed up. You must take responsibility for your own backups. /home is backed up by nightly rsync to another location on the cluster: that means if you delete a crucial file we ''may'' have a useful backup (ask straight away). 99dd57eb012958b10fee1a5953665b26438391c9 565 554 2018-12-24T08:14:14Z Mjh 2 /* Distributed file system */ wikitext text/x-wiki The '''storage''' available on the cluster comes in three flavours. Please use all cluster storage space responsibly: when you no longer need it, delete the contents or transfer them elsewhere so that others can use the space. Large datasets will be being processed on the cluster. (See also [[policies]].) == Overview == Most cluster users will use /home for small amounts of important data (no more than 50 GB) and /beegfs/general for large amounts of data. Your home directory is created automatically for you, but you will need to make a directory /beegfs/general/your_username to work on the /beegfs volume. For details of all the different areas available, read more below. == System-wide NFS storage == This is shared across all machines and served by single servers in each case. It can be slow if many users are trying to access it simultaneously. We are phasing out the use of this kind of storage for large volumes. 
Currently general-user NFS volumes are: * 3 Tb of user home directories, mounted as /home * Software directory /soft * 58 Tb of scratch for CAIR users only, mounted as /cair-scratch * 334 Tb of scratch for CAR users only, mounted as /car-data * 77 Tb of scratch for CAIR users only, mounted as /cair-work In addition there are some special NFS volumes, which are mounted under /data and are only available to particular users or groups: * /data/lofar : for LOFAR-UK users * /data/jim : Jim Geach * /data/ralf : Ralf Napiwotzki * /data/astroml : Machine learning group The /home disc is for files that are important and require long-term storage (source code, qsub scripts, etc). It should not be used for general data or for large quantities of output from jobs, for example. Large files should be stored on the relevant data disk (if you want to work relative to /home, use a symbolic link). A [[quota]] system is in place on /home. == Distributed file system == The cluster uses the [https://www.beegfs.io/content/ Beegfs] file system for distributed storage, mounted at /beegfs . Files stored here are distributed over a number of servers in a way that's transparent to the user, who sees one big file system. Because there are multiple servers, /beegfs scales much better under load than the NFS storage. There is 740 TB of beegfs storage nominally distributed as follows: * 360 TB: general use, under /beegfs/general * 145 TB: CAR, under /beegfs/car * 231 TB: LOFAR-UK, under /beegfs/lofar * 90 TB: CAIR, under /beegfs/cair Use of the relevant subdirectories means that you think you have permission to use those allocations. In practice there is no difference in where or how the data are stored. == Disks local to machines == The four [[SMP machines]] have their own local scratch discs: see the relevant page for more information. These are mounted system-wide as /smp1 .. /smp4 Nodes also have a small amount of [[local disk space]] accessible to jobs (but not mounted on the head node), and [[ramdisks]] are also available. == Backups == No data area on the cluster is currently backed up. You must take responsibility for your own backups. /home is backed up by nightly rsync to another location on the cluster: that means if you delete a crucial file we ''may'' have a useful backup (ask straight away). bc34cf55d93f845c9fb3b7d5db815ff1141472eb LOFAR-UK Compute Facility 0 57 536 500 2018-04-09T17:02:56Z Mjh 2 wikitext text/x-wiki The LOFAR-UK compute facility is a part of the UH HPC facility reserved for LOFAR-UK use. It consists of a dedicated login machine and file server with 256 GB RAM and 16 cores, plus reservations on some number (up to 12) of compute nodes mostly with 64 GB RAM and 16 cores. (Some machines have 192 or 256 GB RAM.) Reservations on the more powerful [[SMP machines]] can be made if necessary or you can compete with other users on the main cluster. LOFAR-UK users can get an account by contacting Martin Hardcastle. This will give them the ability to log in to <tt>lofar.herts.ac.uk</tt>. Data can be downloaded to the dedicated areas <tt>/data/lofar/</tt> or <tt>/beegfs/lofar</tt>. Data processing should generally be carried out on the compute nodes. When you have a particular project to do, ask for a reservation of enough of the LOFAR nodes to allow you to do what you need to do. When you're finished, let us know so that the reservation can be released. (Local, i.e. Herts-based, users need to submit jobs with the option <tt>-W group_list=lofar</tt> to make use of the reservation. 
External LOFAR-UK users do not.) All the standard cluster commands work on lofar.herts.ac.uk. You can do your analysis by running interactive or non-interactive [[jobs]], as appropriate. See the [[LOFAR]] page for information on LOFAR software. A description of the [[Herts LOFAR HBA pipeline]] is available. A description of the [[generic pipeline]] is available. b11f45f42e02f8f85f222f8128796903c9c776ef Fair share 0 39 537 263 2018-04-29T08:33:51Z Mjh 2 wikitext text/x-wiki There are various systems in place to try to make sure that everyone gets a fair share of the cluster (particularly the main cluster) and that a mixture of long and short, large and small jobs can be run. Jobs are run based on a priority assigned by the scheduler. The priority takes account of a number of factors: * Jobs that involve more CPUs intrinsically have higher priority. This is to stop single-CPU work pushing out large jobs. * Jobs that have been in the queue for longer have higher priority, up to a maximum of 64 idle jobs per user (after that, the jobs will wait in the queue but will not accrue priority) * Jobs from users who have not recently significantly used the cluster have priority over those from users who have. (This is called the fair-share mechanism: the 'fair share amount' is a weighted integral of usage over the last week, weighted towards more recent use.) In addition, by default, * no user can have more than 400 processors' worth of jobs running at once. This is intended to stop a single user monopolizing the cluster * no user can have a processor-time product that exceeds 1 week x 128 nodes running at any given time. This is intended to stop large long jobs blocking shorter jobs. These policies apply to all queues but are tailored to the main queue where the competition is highest. If they cause you problems, please contact the [[administrators]]. We are happy to review policies to try to get the fairest result for everyone, and we can relax the default requirements if you have a particular need for more resources. b5940a5858c7b60f3bfe57a72297b95c3f2d4fda 553 537 2018-05-10T10:51:57Z Mjh 2 wikitext text/x-wiki There are various systems in place to try to make sure that everyone gets a fair share of the cluster (particularly the main cluster) and that a mixture of long and short, large and small jobs can be run. Jobs are run based on a priority assigned by the scheduler. The priority takes account of a number of factors: * Jobs that involve more CPUs intrinsically have higher priority. This is to stop single-CPU work pushing out large jobs. * Jobs that have been in the queue for longer have higher priority, up to a maximum of 64 idle jobs per user (after that, the jobs will wait in the queue but will not accrue priority) * Jobs from users who have not recently significantly used the cluster have priority over those from users who have. (This is called the fair-share mechanism: the 'fair share amount' is a weighted integral of usage over the last week, weighted towards more recent use.) In addition, by default, * no user can have more than 512 processors' worth of jobs running at once. This is intended to stop a single user monopolizing the cluster * no user can have a processor-time product that exceeds 1 week x 128 cores running at any given time. This is intended to stop large long jobs blocking shorter jobs. These policies apply to all queues but are tailored to the main queue where the competition is highest. If they cause you problems, please contact the [[administrators]]. 
We are happy to review policies to try to get the fairest result for everyone, and we can relax the default requirements if you have a particular need for more resources. d2f6192c15c11346900054c989ed86f5a813ba76 Terms of use 0 77 538 2018-04-29T09:43:50Z Mjh 2 Created page with "Use of the UHHPC facility is subject to terms and conditions. To apply for or continue to hold an account you must explicitly agree to these conditions. * Access to UHHPC is..." wikitext text/x-wiki Use of the UHHPC facility is subject to terms and conditions. To apply for or continue to hold an account you must explicitly agree to these conditions. * Access to UHHPC is available to three classes of people: *# Members of the University of Hertfordshire (UH), including undergraduate students, who have a legitimate need to use the cluster facilities. Undergraduate students must apply through an academic supervisor. *# External collaborators of UH research staff, for work on projects that will directly benefit UH. *# Members of external consortia who have negotiated an agreement with UH (e.g. [[WEAVE]], [[LOFAR-UK Compute Facility|LOFAR-UK]]). * Access is conditional on abiding by the [[policies]] that govern use of the cluster. Access may be withdrawn at the discretion of the [[administrators]] in extreme cases. * Access is given to individuals. Account credentials may not be shared. An individual is solely responsible for the use made of their account. * UH will store the full name and contact e-mail address of all users. A valid, monitored e-mail address must be on record for all users. Users must notify the [[administrators]] of any change in their contact details. * UH takes no responsibility for backing up users' data. Users store data on UHHPC at their own risk. * UH makes no guarantee about the level of service provided at any given time. 52fe230325556184cd2ccff98798ab30612f58fb 539 538 2018-04-29T09:46:37Z Mjh 2 wikitext text/x-wiki Use of the UHHPC facility is subject to terms and conditions. To apply for or continue to hold an account you must explicitly agree to these conditions. * Access to UHHPC is available to three classes of people: *# Members of the University of Hertfordshire (UH), including undergraduate students, who have a legitimate need to use the cluster facilities. Undergraduate students must apply through an academic supervisor. *# External collaborators of UH research staff, for work on projects that will directly benefit UH. *# Members of external consortia who have negotiated an agreement with UH (e.g. [[WEAVE]], [[LOFAR-UK Compute Facility|LOFAR-UK]]). * Access is conditional on abiding by the [[policies]] that govern use of the cluster. Access may be withdrawn at the discretion of the [[administrators]] in extreme cases. * Access is given to individuals. Account credentials may not be shared. An individual is solely responsible for the use made of their account. * UH will store the full name and contact e-mail address of all users. A valid, monitored e-mail address must be on record for all users. Users must notify the [[administrators]] of any change in their contact details. * UH takes no responsibility for backing up users' data. Users store data on UHHPC at their own risk. * UH makes no guarantee about the level of service provided at any given time. * Access is provided only while it is needed. Users must notify the [[administrators]] when they no longer need access. Accounts will be deleted according to the [[Account cancellation policy]]. 
8313faef8e763fb4d82aa6bd7c58347243508f78 546 539 2018-05-02T13:49:02Z Mjh 2 wikitext text/x-wiki Use of the UHHPC facility is subject to terms and conditions. To apply for or continue to hold an account you must explicitly agree to these conditions. * Access to UHHPC is available to three classes of people: *# Members of the University of Hertfordshire (UH), including undergraduate students, who have a legitimate need to use the cluster facilities. Undergraduate students must apply through an academic supervisor. *# External collaborators of UH research staff, for work on projects that will directly benefit UH. *# Members of external consortia who have negotiated an agreement with UH (e.g. [[WEAVE]], [[LOFAR-UK Compute Facility|LOFAR-UK]]). * Access is conditional on abiding by the [[policies]] that govern use of the cluster. Access may be withdrawn at the discretion of the [[administrators]] in extreme cases. * Access is given to individuals. Account credentials may not be shared. An individual is solely responsible for the use made of their account. * UH will store the full name and contact e-mail address of all users. A valid, monitored e-mail address must be on record for all users. Users must notify the [[administrators]] of any change in their contact details. Your details will be stored securely on the cluster itself and will be used both for a cluster mailing list and to contact you in case of problems with your use of the cluster. * UH takes no responsibility for backing up users' data. Users store data on UHHPC at their own risk. * UH makes no guarantee about the level of service provided at any given time. * Access is provided only while it is needed. Users must notify the [[administrators]] when they no longer need access. Accounts will be deleted according to the [[Account cancellation policy]]. a1a57b1e7ca2cba726265b6a8364071ff3da38e6 547 546 2018-05-06T21:40:11Z Mjh 2 wikitext text/x-wiki Use of the UHHPC facility is subject to terms and conditions. To apply for or continue to hold an account you must explicitly agree to these conditions. * Access to UHHPC is available to three classes of people: *# Members of the University of Hertfordshire (UH), including undergraduate students, who have a legitimate need to use the cluster facilities. Undergraduate students must apply through an academic supervisor. *# External collaborators of UH research staff, for work on projects that will directly benefit UH. *# Members of external consortia who have negotiated an agreement with UH (e.g. [[WEAVE]], [[LOFAR-UK Compute Facility|LOFAR-UK]]). * Access is conditional on abiding by the [[policies]] that govern use of the cluster. Access may be withdrawn at the discretion of the [[administrators]] in extreme cases. * Access is given to individuals. Account credentials may not be shared. An individual is solely responsible for the use made of their account. * UH will store the full name and contact e-mail address of all users. A valid, monitored e-mail address must be on record for all users. Users must notify the [[administrators]] of any change in their contact details. Your details will be stored securely on the cluster itself and will be used both for a cluster mailing list and to contact you in case of problems with your use of the cluster. * The administrators may take whatever actions they feel necessary to ensure the continued operation and security of the facility, which may include inspecting any data or programs stored on the cluster. 
* UH takes no responsibility for backing up users' data. Users store data on UHHPC at their own risk. * UH makes no guarantee about the level of service provided at any given time. * Access is provided only while it is needed. Users must notify the [[administrators]] when they no longer need access. Accounts will be deleted according to the [[Account cancellation policy]]. f19da811178691b698638cd51b0c9ea4c6875108 548 547 2018-05-10T10:19:58Z Mjh 2 wikitext text/x-wiki Use of the UHHPC facility is subject to terms and conditions. To apply for or continue to hold an account you must explicitly agree to these conditions. * Access to UHHPC is available to three classes of people: *# Members of the University of Hertfordshire (UH), including undergraduate students, who have a legitimate need to use the cluster facilities. Undergraduate students must apply through an academic supervisor. *# External collaborators of UH research staff, for work on projects that will directly benefit UH. *# Members of external consortia who have negotiated an agreement with UH (e.g. [[WEAVE]], [[LOFAR-UK Compute Facility|LOFAR-UK]]). * Access is conditional on abiding by the [[policies]] that govern use of the cluster. Access may be withdrawn at the discretion of the [[administrators]] in extreme cases. * Access is given to individuals. Account credentials may not be shared. An individual is solely responsible for the use made of their account. * UH will store the full name and contact e-mail address of all users. A valid, monitored e-mail address must be on record for all users. Users must notify the [[administrators]] of any change in their contact details. Your details will be stored securely on the cluster itself and will be used both for a cluster mailing list and to contact you in case of problems with your use of the cluster. * The administrators may take whatever actions they feel necessary for troubleshooting or to ensure the smooth operation and security of the facility, which may include inspecting any data or programs stored on the cluster. * UH takes no responsibility for backing up users' data. Users store data on UHHPC at their own risk. * UH makes no guarantee about the level of service provided at any given time. * Access is provided only while it is needed. Users must notify the [[administrators]] when they no longer need access. Accounts will be deleted according to the [[Account cancellation policy]]. 7d7928b112fedf279dcc3803ac3ce838bfc1b80b 549 548 2018-05-10T10:34:24Z Mjh 2 Protected "[[Terms of use]]" ([Edit=Allow only administrators] (indefinite) [Move=Allow only administrators] (indefinite)) wikitext text/x-wiki Use of the UHHPC facility is subject to terms and conditions. To apply for or continue to hold an account you must explicitly agree to these conditions. * Access to UHHPC is available to three classes of people: *# Members of the University of Hertfordshire (UH), including undergraduate students, who have a legitimate need to use the cluster facilities. Undergraduate students must apply through an academic supervisor. *# External collaborators of UH research staff, for work on projects that will directly benefit UH. *# Members of external consortia who have negotiated an agreement with UH (e.g. [[WEAVE]], [[LOFAR-UK Compute Facility|LOFAR-UK]]). * Access is conditional on abiding by the [[policies]] that govern use of the cluster. Access may be withdrawn at the discretion of the [[administrators]] in extreme cases. * Access is given to individuals. 
Account credentials may not be shared. An individual is solely responsible for the use made of their account. * UH will store the full name and contact e-mail address of all users. A valid, monitored e-mail address must be on record for all users. Users must notify the [[administrators]] of any change in their contact details. Your details will be stored securely on the cluster itself and will be used both for a cluster mailing list and to contact you in case of problems with your use of the cluster. * The administrators may take whatever actions they feel necessary for troubleshooting or to ensure the smooth operation and security of the facility, which may include inspecting any data or programs stored on the cluster. * UH takes no responsibility for backing up users' data. Users store data on UHHPC at their own risk. * UH makes no guarantee about the level of service provided at any given time. * Access is provided only while it is needed. Users must notify the [[administrators]] when they no longer need access. Accounts will be deleted according to the [[Account cancellation policy]]. 7d7928b112fedf279dcc3803ac3ce838bfc1b80b Quota 0 38 540 256 2018-04-30T11:25:38Z Mjh 2 wikitext text/x-wiki Use of space on <tt>/home</tt> is subject to a quota system. You have a fixed amount of space that you are allowed to use: if you exceed this, you will no longer be able to create files on <tt>/home</tt>. The current default quota for all users is 50 GB. When you reach 49 GB, you will be warned and given a period (1 week) in which your usage should be reduced below 49 GB; if you fail to reduce usage in this period, or if your usage reaches 50 GB, new file creation will be blocked. The quota is ''not'' an indication of reasonable expected usage for a cluster user. You should try to keep your use of <tt>/home</tt> as low as possible. If you believe you have a specific need to exceed this quota, please contact the cluster [[administrators]]. There is no quota on the various data areas (see [[Storage]]) and these are the locations where it is appropriate to store large volumes of data. 0aed4b60f2544eed36502db8c4a3a44ccfc79b39 Jobs 0 9 543 507 2018-04-30T22:00:56Z Mjh 2 wikitext text/x-wiki The main mechanism for carrying out computing on the cluster is the '''batch queue system'''. This is based around [http://docs.adaptivecomputing.com/torque/help.htm Torque] (formerly known as PBS) and [http://docs.adaptivecomputing.com/maui/index.php Maui], which are free software and widely used in the HPC world, particularly in academia. Torque is the queueing system, which handles job submission, and Maui is the scheduler, which decides what jobs are run. Rather than running processes directly on the compute nodes, you submit a script which, in general, contains both directives to the queueing system and the commands needed to run your jobs. The scheduler will then decide, based on the available resources, whether your job can be run immediately or must be deferred until later. If other people's jobs are running, you are likely to have to wait. It is therefore important to request only the minimum resources you need: a job that only needs 8 nodes may be able to run immediately, while a job that requests all nodes in the main cluster may need to wait for days. Once your job starts running, the standard output and standard error can be saved or discarded, at your option. Most jobs will operate on files in one of the cluster-wide file systems (see [[storage]]).
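To give a flavour of the workflow before the details below, a complete (if trivial) session might look like the following sketch. The script name and its contents are purely illustrative.
<pre>
# Write a job script containing both queue directives and the commands to run
cat << 'END' > example.sh
#!/bin/sh
#PBS -N example
#PBS -l walltime=0:10:0
#PBS -l nodes=1
cd $PBS_O_WORKDIR    # start in the directory qsub was run from
echo "Hello from `hostname`"
END

# Submit it to the batch queue, then watch its progress
qsub example.sh
qstat
</pre>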
You can also arrange to have the system send you e-mail when your job starts and/or ends. You should enable [[passwordless ssh]] between nodes before starting to use the job submission system. Various parts of it and code that depends on it rely on being able to use ssh or scp. == Basic commands == The command to submit a job to the batch queue system is <tt>qsub</tt>. <tt>qsub</tt> takes one obligatory argument, the name of the script that will be run to start the job: <pre> qsub myjob.sh </pre> This must be a script: specifying a binary file will not work. Optionally, qsub can take a number of arguments that specify what resources the job requires and how it is to be handled by the queuing system. Alternatively, these can be specified in the script, tagged with lines beginning <tt>#PBS</tt> before any executable lines. For example, <pre> cat << END > myjob1.sh #!/bin/sh echo "Hello world" END qsub -N hello -m abe myjob1.sh </pre> and <pre> cat <<END > myjob2.sh #!/bin/sh #PBS -N hello #PBS -m abe echo "Hello world" END qsub myjob2.sh </pre> are functionally equivalent. It is often more useful to have the options specified in the script. You can also have a mixture: if the same option is specified on the <tt>qsub</tt> command line and in a script directive, the command line takes precedence. A full list of qsub options is given by <tt>man qsub</tt> or is available [http://docs.adaptivecomputing.com/torque/2-5-12/help.htm#topics/commands/qsub.htm online]. Some important ones are as follows: * -N: specify the name of the job as it will appear in the queue * -m: say when the system should e-mail you: <tt>-m abe</tt> means to e-mail on abort, begin and end. Before setting this flag, read the page on [[Mail]] and take appropriate action! * -k: specify whether the standard output or standard error streams of the job should be mirrored in your home directory. If you specify this, you will be able to see the output as it's generated, but it will appear in your home directory. If you don't, the output will be stored in the directory specified by -o and -e or in the current working directory of qsub if they are not specified, but it will only appear once the job is finished. * -j: specify whether the output and error streams should be kept separate or merged. * -o, -e: specify where standard output/error should be stored, if not in the current directory * -q: specify what [[queues|queue]] the job will be run on. * -t: start multiple jobs simultaneously (see below) * -W: for job inter-dependencies (see below) * -v: to set environment variables (see below) The most important option is <tt>-l</tt>, which specifies the resources to be used by the job. On our system the useful resources to specify are * <tt>walltime</tt>: the expected execution time in h:m:s format (if unspecified, the default walltime on all queues is 24 hours). * <tt>nodes</tt>: a list of requirements on the nodes to be allocated, or possibly more than one joined by a plus sign. * <tt>pmem</tt>: the physical [[memory]] requirements for your job. * <tt>file</tt>: the [[local disk space]] to be used by your job. 
For example, <pre> qsub -l walltime=2:0:0 -l nodes=64 job.sh ## run for two hours on 64 CPUs (which may be spread over up to 64 physical nodes) qsub -l walltime=1:0:0 -l nodes=2:ppn=32 job.sh ## run for one hour on 2 nodes with 32 CPUs per node qsub -l walltime=0:1:0 -l nodes=1:2 -l pmem=8gb job5.sh ## Run two processes on one node for one minute, requesting 8 Gb of memory each </pre> (In general you will not want to specify node or [[architecture|chassis]] names directly, as this will interfere with the efficiency of the scheduling. Do not attempt to use these options unless you know what you are doing. Note also that in the present configuration requests for fewer processors per node than are available may be consolidated onto fewer nodes. This should never be a problem unless you are specifically testing inter-node communication, in which case you could specify node names to force running on different nodes. For efficiency, you are recommended always to ask for and use the maximum number of processes per node for MPI or multi-threaded code -- that is, 32 processes for the main cluster. If you are running single-threaded code, you should ask for one processor per node. The defaults for all the other options are sensible but you ''must'' think carefully about the resources you need. Requesting more time or resources than you need will lead to inefficiency in scheduling your jobs: requesting less than you need (particularly in terms of walltime) is likely to mean that your job does not run to completion. A job that exceeds the walltime estimate will be terminated. ''A walltime should be specified: the default walltime (see [[Queues]]) is probably not what you want.'' Once a job is submitted, you can view its progress with <tt>qstat</tt>: <pre> >qstat Job id Name User Time Use S Queue ------------------- ---------------- --------------- -------- - ----- 1765.stri-cluster HCG62-doublebeta mjh 00:00:00 R all 1766.stri-cluster ...59-doublebeta mjh 00:00:00 R all 1767.stri-cluster ...07-doublebeta mjh 00:00:00 R all 1768.stri-cluster ...83-doublebeta mjh 00:00:00 R all 1769.stri-cluster ...73-doublebeta mjh 00:00:00 R all 1770.stri-cluster ...61-doublebeta mjh 0 Q all 1771.stri-cluster ...36-doublebeta mjh 0 Q all 1772.stri-cluster ...44-doublebeta mjh 0 Q all 1773.stri-cluster ...29-doublebeta mjh 0 Q all 1774.stri-cluster ...33-doublebeta mjh 0 Q all 1775.stri-cluster ...46-doublebeta mjh 0 Q all >qstat -a stri-cluster.herts.ac.uk: Req'd Req'd Elap Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time -------------------- -------- -------- ---------- ------ ----- --- ------ ----- - ----- 1765.stri-cluster.he mjh all HCG62-doub 11125 16 -- -- 05:00 R 01:56 1766.stri-cluster.he mjh all IC1459-dou 10658 16 -- -- 05:00 R 01:56 1767.stri-cluster.he mjh all NGC3607-do 7355 16 -- -- 05:00 R 01:56 1768.stri-cluster.he mjh all NGC383-dou 12069 16 -- -- 05:00 R 01:56 1769.stri-cluster.he mjh all NGC4073-do 12793 16 -- -- 05:00 R 01:56 1770.stri-cluster.he mjh all NGC4261-do -- 16 -- -- 05:00 Q -- 1771.stri-cluster.he mjh all NGC4636-do -- 16 -- -- 05:00 Q -- 1772.stri-cluster.he mjh all NGC5044-do -- 16 -- -- 05:00 Q -- 1773.stri-cluster.he mjh all NGC5129-do -- 16 -- -- 05:00 Q -- 1774.stri-cluster.he mjh all NGC533-dou -- 16 -- -- 05:00 Q -- 1775.stri-cluster.he mjh all NGC5846-do -- 16 -- -- 05:00 Q -- </pre> The different options to qstat give different views of the queue. Here we see that some jobs are running (R) and some are queued (Q). 
You may also see jobs that have recently completed (C), jobs that are held pending some condition (H) and, hopefully rarely, jobs where there is some error in the scheduling system (E). The MAUI tool <tt>showq</tt> can also be used to get a view of the whole queue, which is sometimes more user-friendly: <pre> /usr/local/maui/bin/showq ACTIVE JOBS-------------------- JOBNAME USERNAME STATE PROC REMAINING STARTTIME 1765 mjh Running 128 1:15:20 Fri May 7 13:38:20 1766 mjh Running 128 1:15:20 Fri May 7 13:38:20 1767 mjh Running 128 1:15:20 Fri May 7 13:38:20 1768 mjh Running 128 1:15:20 Fri May 7 13:38:20 1769 mjh Running 128 1:15:20 Fri May 7 13:38:20 5 Active Jobs 640 of 640 Processors Active (100.00%) 80 of 80 Nodes Active (100.00%) IDLE JOBS---------------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME 1770 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1771 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1772 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1773 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1774 mjh Idle 128 5:00:00 Fri May 7 13:38:20 1775 mjh Idle 128 5:00:00 Fri May 7 13:38:20 6 Idle Jobs BLOCKED JOBS---------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME Total Jobs: 11 Active Jobs: 5 Idle Jobs: 6 Blocked Jobs: 0 </pre> Finally, you can delete jobs from the queue with the Torque command <tt>qdel</tt>, you can change your resource request with <tt>qalter</tt>, and you can move queued jobs between different queues with <tt>qmove</tt>. == Running code == Your job will start running on ''one'' of the nodes it has been allocated. It is your script's responsibility to start your code running on any other nodes you have been allocated. At run time, environment variables contain information about the job. A full list is [http://docs.adaptivecomputing.com/torque/2-5-12/help.htm#topics/2-jobs/exportedBatchEnvVar.htm here]. The crucial one is $PBS_NODEFILE, which provides a list of nodes on which the job should run. ''You must honour this node list.'' The [http://docs.adaptivecomputing.com/torque/2-5-12/help.htm#topics/commands/pbsdsh.htm pbsdsh] command can be used in your script to run a single program on all the allocated nodes, as in the following example: <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ pbsdsh hostname echo ------------------------------------------------------ echo Job ends </pre> pbsdsh would be suitable for starting a number of parallel processes which do not need to communicate, e.g. multiple identical Monte Carlo simulations. For scripts to start [[MPI]] processes in such a way that inter-process communication (IPC) works see the description of [[MPI]]. If you are running [[OpenMP]] code, which should only be running on a single node, then you need to make sure that the code does not grab all available CPUs. 
An easy way to make sure that you take only the CPUs that you have been allocated would be to put <pre> setenv OMP_NUM_THREADS `cat $PBS_NODEFILE | wc -l` </pre> in the qsub script before the code runs. == Environment == In all cases your script must set the environment that you want your program to run in. Environment variables, aliases and working directory are not inherited from the shell that runs qsub. So you will often want to change to an appropriate working directory and to source some startup files in your qsub script before you actually execute any code. <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ cd /home/fred/my_working_directory export PATH=/home/fred/my_binaries:${PATH} /usr/local/bin/mpiexec my-mpi-code arg1 arg2 echo ------------------------------------------------------ echo Job ends </pre> Note that you can set environment variables in the environment of the running script with the <tt>-v</tt> option to <tt>qsub</tt>: <pre> qsub -v NAME=fred myjob.sh </pre> The value of <tt>$NAME</tt> is then available to the script. This is useful if you want to write a generic script and control it using a parameter, rather than editing the script. == Multiple job submission == The <tt>-t</tt> option to qsub allows you to create multiple jobs simultaneously. These will all run the same script but will differ in the value of the environment variable <tt>$PBS_ARRAYID</tt>. Your script should normally use this variable to decide how to behave differently. For example, multiple runs of a Monte Carlo simulation might use it to store the output in different files. The jobs will all be submitted to the queue immediately and may run concurrently, depending on the resources they request, so it is important to make sure that it is safe to do this. A common mistake is to use the -t option wrongly. <pre> qsub -t 4 myjob.qsub </pre> will only start ''one'' job, with $PBS_ARRAYID set to 4. <pre> qsub -t 1-4 myjob.qsub </pre> will queue four jobs, with $PBS_ARRAYID ranging from 1 to 4. == Interactive jobs == If you need to access nodes interactively, see the separate page on [[interactive jobs]]. == Jobs that depend on other jobs == You may wish to submit jobs that depend on each other: for example, to submit a job that only runs after another job has finished. (This allows you to stack up very long computing requests without violating the maximum walltime constraints.) Look at <tt>man qsub</tt> for the full details, but for a simple dependence use <pre> qsub -W depend=afterany:123456 myjob.qsub </pre> which will cause your job to be run only after job 123456 has finished (no matter what its output status). 
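In practice you will usually not know the job ID of the earlier job in advance. Since <tt>qsub</tt> prints the ID of the job it has just created on standard output, a common pattern is to capture that ID in a shell variable and pass it to the next submission. A minimal sketch (the script names are illustrative):
<pre>
# Submit the first stage and capture its job ID from qsub's output
FIRST=`qsub stage1.qsub`

# Submit the second stage so that it starts only once the first has finished
qsub -W depend=afterany:$FIRST stage2.qsub
</pre>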
aad3bca4dfba71e2377a3c1830f0053e04e11cf7 Architecture 0 7 544 518 2018-04-30T22:05:10Z Mjh 2 /* compute nodes */ wikitext text/x-wiki The cluster consists of == head/login nodes == * 2 head nodes: headnode1 and headnode2, each with 192 GB RAM and 2 x 12-core Xeon processors, for user login and development * job/queue server, stri-cluster, with 2 4-core Xeons and 32 GB RAM: old login node, now deprecated == compute nodes == Older compute nodes are in 16-blade chassis: currently chassis5-9 are populated. The newer compute nodes, which will be used by most people, are in two racks, rack1 and rack2. The smp machines are outside any particular rack or chassis. * 64 Xeons (Gold 6130) 2 socket x 16 core (no hyperthreading) with 192 GB RAM and FDR Infiniband form part of the Main cluster (node001-064: rack1, rack2), in the main queue * 16 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 12 GB RAM and QDR Infiniband form part of the CAIR cluster (node065-80: chassis 5), in the cair_l queue * 12 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form the rest of the CAIR cluster (node081-92: chassis 6), in the cair_l and cair_s queues * 2 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 24 GB RAM and DDR Infiniband form part of the main cluster (node093-94: chassis 6), in the main queue * 16 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form part of the CAR cluster (node097-node112: chassis 7), in the car queue * 16 Xeons (X5675s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form the rest of the CAR cluster (node113-128: chassis 8), in the car queue * 12 Xeons (E5-2660s) 2 socket x 8-core with 64 GB RAM and FDR Infiniband form the rest of the main cluster (node129-140: chassis 9), in the main queue * 4 Xeons (E5-2660s) 2 socket x 8-core with 16 GB RAM and FDR Infiniband are dedicated AQF (CAIR) nodes (node141-144: chassis 9), in the forecast queue * Four [[SMP machines]], two with 48 cores (4 sockets x 12 cores, Opteron 6174) and two with 32 (one 4 sockets x 8 cores, Xeon 4620), all with 256 GB RAM and QDR Infiniband (smp1-4). * Two SMP blades in chassis6 (node095-096) with 32 cores (2 sockets x 16 cores) and 256 GB RAM, in the smp queue * A [[GPUs|GPU]] machine, gpu1, in the gpu queue == GPUs == * See the separate page on [[GPUs]] == Servers and dedicated login nodes == * A file server, stri-server, which is a 2 socket x 8 core Xeon machine ** 473 TB of [[storage]] attached via Fibre Channel to this server. * A cair data processing server - [[cair-cluster]], providing logins and storage to CAIR users ** 132 TB of [[storage]] attached via Fibre Channel to this server. * A separate server -- [[cair-forecast]] -- providing an additional 80 TB of storage and processing for CAIR AQF use. * A separate server -- [[lofar-server]] -- providing logins and 100 TB of storage for [[LOFAR-UK Compute Facility|LOFAR-UK]] users * ramius, a dedicated 16-core machine * mancuso, a dedicated 20-core machine * dstorage1-9, file servers providing the BeegFS distributed [[storage]] == Networking == * Ethernet and infiniband switches provide connectivity. The image below shows the physical layout of the cluster components as at late 2011. The cluster is no longer in this configuration. 
http://stri-cluster.herts.ac.uk/cluster2.jpg 72aeb4df24dd68d7e0fc40e646539cc8f044ce78 563 544 2018-12-24T08:11:40Z Mjh 2 wikitext text/x-wiki The cluster consists of == head/login nodes == * 2 head nodes: headnode1 and headnode2, each with 192 GB RAM and 2 x 12-core Xeon processors, for user login and development * job/queue server, stri-cluster, with 2 4-core Xeons and 32 GB RAM: web and job servers == compute nodes == Older compute nodes are in 16-blade chassis: currently chassis5-9 are populated. The newer compute nodes, which will be used by most people, are in two racks, rack1 and rack2. The smp machines are outside any particular rack or chassis. * 80 Xeons (Gold 6130) 2 socket x 16 core (no hyperthreading) with 192 GB RAM and FDR Infiniband form part of the Main cluster (node001-080: rack1, rack2), in the main queue * 12 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form the CAIR cluster (node081-92: chassis 6), in the cair_l queue * 2 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 24 GB RAM and DDR Infiniband form part of the main cluster (node093-94: chassis 6), in the main queue * 16 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form part of the CAR cluster (node097-node112: chassis 7), in the car queue * 16 Xeons (X5675s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form the rest of the CAR cluster (node113-128: chassis 8), in the car queue * 12 Xeons (E5-2660s) 2 socket x 8-core with 64 GB RAM and FDR Infiniband form the rest of the main cluster (node129-140: chassis 9), in the main queue * 4 Xeons (E5-2660s) 2 socket x 8-core with 16 GB RAM and FDR Infiniband are dedicated AQF (CAIR) nodes (node141-144: chassis 9), in the forecast queue * Four [[SMP machines]], two with 48 cores (4 sockets x 12 cores, Opteron 6174) and two with 32 (one 4 sockets x 8 cores, Xeon 4620), all with 256 GB RAM and QDR Infiniband (smp1-4). * Two SMP blades in chassis6 (node095-096) with 32 cores (2 sockets x 16 cores) and 256 GB RAM, in the smp queue * A [[GPUs|GPU]] machine, gpu1, in the gpu queue == GPUs == * See the separate page on [[GPUs]] == Servers and dedicated login nodes == * A file server, stri-server, which is a 2 socket x 8 core Xeon machine ** 473 TB of [[storage]] attached via Fibre Channel to this server. * A cair data processing server - [[cair-cluster]], providing logins and storage to CAIR users ** 132 TB of [[storage]] attached via Fibre Channel to this server. * A separate server -- [[cair-forecast]] -- providing an additional 80 TB of storage and processing for CAIR AQF use. * A separate server -- [[lofar-server]] -- providing logins and 100 TB of storage for [[LOFAR-UK Compute Facility|LOFAR-UK]] users * ramius, a dedicated 16-core machine * mancuso, a dedicated 20-core machine * dstorage1-9, file servers providing the BeegFS distributed [[storage]] == Networking == * Ethernet and infiniband switches provide connectivity. The image below shows the physical layout of the cluster components as at late 2011. The cluster is no longer in this configuration. 
http://stri-cluster.herts.ac.uk/cluster2.jpg 1ba9683c3561bbd028aeb34f7322e8dff15f5036 564 563 2018-12-24T08:12:02Z Mjh 2 /* Servers and dedicated login nodes */ wikitext text/x-wiki The cluster consists of == head/login nodes == * 2 head nodes: headnode1 and headnode2, each with 192 GB RAM and 2 x 12-core Xeon processors, for user login and development * job/queue server, stri-cluster, with 2 4-core Xeons and 32 GB RAM: web and job servers == compute nodes == Older compute nodes are in 16-blade chassis: currently chassis5-9 are populated. The newer compute nodes, which will be used by most people, are in two racks, rack1 and rack2. The smp machines are outside any particular rack or chassis. * 80 Xeons (Gold 6130) 2 socket x 16 core (no hyperthreading) with 192 GB RAM and FDR Infiniband form part of the Main cluster (node001-080: rack1, rack2), in the main queue * 12 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form the CAIR cluster (node081-92: chassis 6), in the cair_l queue * 2 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 24 GB RAM and DDR Infiniband form part of the main cluster (node093-94: chassis 6), in the main queue * 16 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form part of the CAR cluster (node097-node112: chassis 7), in the car queue * 16 Xeons (X5675s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form the rest of the CAR cluster (node113-128: chassis 8), in the car queue * 12 Xeons (E5-2660s) 2 socket x 8-core with 64 GB RAM and FDR Infiniband form the rest of the main cluster (node129-140: chassis 9), in the main queue * 4 Xeons (E5-2660s) 2 socket x 8-core with 16 GB RAM and FDR Infiniband are dedicated AQF (CAIR) nodes (node141-144: chassis 9), in the forecast queue * Four [[SMP machines]], two with 48 cores (4 sockets x 12 cores, Opteron 6174) and two with 32 (one 4 sockets x 8 cores, Xeon 4620), all with 256 GB RAM and QDR Infiniband (smp1-4). * Two SMP blades in chassis6 (node095-096) with 32 cores (2 sockets x 16 cores) and 256 GB RAM, in the smp queue * A [[GPUs|GPU]] machine, gpu1, in the gpu queue == GPUs == * See the separate page on [[GPUs]] == Servers and dedicated login nodes == * A file server, stri-server, which is a 2 socket x 8 core Xeon machine ** 473 TB of [[storage]] attached via Fibre Channel to this server. * A cair data processing server - [[cair-cluster]], providing logins and storage to CAIR users ** 132 TB of [[storage]] attached via Fibre Channel to this server. * A separate server -- [[cair-forecast]] -- providing an additional 80 TB of storage and processing for CAIR AQF use. * A separate server -- [[lofar-server]] -- providing logins and 100 TB of storage for [[LOFAR-UK Compute Facility|LOFAR-UK]] users * ramius, a dedicated 16-core machine * mancuso, a dedicated 20-core machine * dstorage1-10, file servers providing the BeegFS distributed [[storage]] == Networking == * Ethernet and infiniband switches provide connectivity. The image below shows the physical layout of the cluster components as at late 2011. The cluster is no longer in this configuration. http://stri-cluster.herts.ac.uk/cluster2.jpg f57da34d4263958e3f345bc4e7a50fee7fcfc347 Networking 0 10 545 463 2018-04-30T22:12:21Z Mjh 2 wikitext text/x-wiki The nodes are linked by Gigabit ethernet and Infiniband networks. 
The ethernet network is a fairly conventional one; each chassis (see [[architecture]]) has an internal ethernet switch to which all nodes in that chassis are connected. These switches, together with the head node, are connected via a main ethernet switch. The links connecting the head node and the SMP machines, and the uplinks from the two node internal ethernet stacks, are carried by multiple physical cables via link aggregation. A separate physical ethernet network carries management traffic. The infiniband network is slightly more complex. Each chassis or rack (see [[Architecture]]) has an internal infiniband switch and these are all linked via two main infiniband switches, one FDR14, one QDR. The main cluster nodes in rack1 and rack2 use FDR14 infiniband (56 Gb/s), and chassis9 uses FDR10 (40 Gb/s); all other machines on the network have QDR infiniband cards (40 Gb/s). The networking arrangements should be taken into consideration when running jobs where fast communication is important -- latency is lower and data transfer rates are somewhat higher between nodes in the same chassis or rack than between different chassis, and ethernet connections are higher-latency and lower-bandwidth still. Best results will be obtained for IPC if jobs run in the same chassis or rack. The scheduler is aware of this and will try to ensure that jobs do not span more than one rack or chassis. None of the nodes are on the public Internet: they all have private IP addresses starting 192.168.... These addresses are not routable from anywhere else in the University. They can, however, all see the public Internet via IP masquerading through the head node over the ethernet network. (For this reason, it is not sensible to try to make high-volume accesses to the Internet from the nodes.) Each node has an IP address corresponding to its ethernet port (192.168.2.xxx where xxx is the node number) and one corresponding to the Infiniband port (192.168.3.xxx). Within the cluster, you may use nodexxx.data (e.g. node001.data) and nodexxx.infi (e.g. node001.infi) to refer to these two networks. High-volume traffic should use the Infiniband network. The SMP machines have addresses smp1.data, smp1.infi etc. f6856ca713a195c7c9a950450f4b222a3a0b8bc1 Administrators 0 6 550 483 2018-05-10T10:37:43Z Mjh 2 wikitext text/x-wiki == Administrators == These are currently: * Vito Graffagnino, v.graffagnino@herts.ac.uk (x3358, room 1E71 Innovation Centre). * Martin Hardcastle, m.j.hardcastle@herts.ac.uk (x3409, room 2E71 Innovation Centre). UH staff and students should see Vito (or failing that Martin) to get an account. External consortium users should contact Martin in the first instance. 654fae6d371ca7ebef2a3defb258c59ddab39027 Accounts 0 3 551 479 2018-05-10T10:48:36Z Mjh 2 wikitext text/x-wiki To get an account, contact the [[administrators]]. Accounts are available to all staff and research students of UH, and to others by special arrangement. Access is granted subject to the [[Terms of use]] of the cluster and to observance of our usage [[policies]]. Accounts may be suspended or cancelled if the cluster is abused: see also our [[Account cancellation policy]]. 059676120394ae0ef21938d682920f4cd8fd0a2e Main Page 0 1 552 510 2018-05-10T10:50:22Z Mjh 2 /* Cluster basics */ wikitext text/x-wiki == Welcome to the cluster documentation wiki == This wiki is the location for documentation for the UH HPC service.
If you are a cluster user, feel free to register for an account so that you can describe any method you use to achieve a particular result on the cluster. You must be approved for an account before editing the Wiki. Some local documentation on [[MediaWiki]] is available. == Getting started == * [[Read this first]] == Cluster basics == * [[Accounts]] and [[Account cancellation policy]] * [[Terms of use]] * [[Policies]] * [[Fair share]] * [[Access]] * [[Architecture]] * [[Networking]] * [[Storage]]; [[quota]] system * [[Administrators]]' contact details == Using the cluster == * [[Jobs]] * [[Queues]] * [[Reservations]] * [[SMP machines]] * [[GPUs]] * [[Modules]] * [[MPI]] * [[OpenMP]] * [[Parallelization|How to parallelize your job]] * [[Compilers]] * [[Software]] * [[Mail]] * [[Web server]] * [[Monitoring]] * [[LOFAR-UK Compute Facility]] == Troubleshooting == * [[Why doesn't my job run?]] * [[Job errors]] == Publications == * [[Acknowledgements]] * [[Cluster bibliography]] == How-Tos == * [[Star-CCM+|How to use Star-CCM+ on the cluster]] == Known problems == * [[Known problems]] 889281d9ac3c941ae1ed684a0ddd013b2f23d8a9 Why doesn't my job run? 0 37 555 353 2018-05-11T16:29:38Z Mjh 2 wikitext text/x-wiki If you submit a [[jobs|job]] and it stays in the queue, it may not be clear why nothing is happening. Obviously one possibility is that there is a problem with the cluster: but it is also possible that things are going on that you are not seeing. To ask the scheduler what is going on with a job you can use the command <tt>/usr/local/maui/bin/checkjob</tt>. A typical <tt>checkjob</tt> run might look like this (note the <tt>-v</tt> option): <pre> /usr/local/maui/bin/checkjob -v 123456 checking job 123456 (RM job '123456.stri-cluster.herts.ac.uk') State: Idle Creds: user:fred group:fred class:main qos:DEFAULT WallTime: 00:00:00 of 7:00:00:00 SubmitTime: Fri Jul 8 09:04:48 (Time Queued Total: 00:38:52 Eligible: 00:38:52) Total Tasks: 24 Req[0] TaskCount: 24 Partition: ALL Network: [NONE] Memory >= 1024M Disk >= 0 Swap >= 0 Opsys: [NONE] Arch: [NONE] Features: [main] Exec: '' ExecSize: 0 ImageSize: 0 Dedicated Resources Per Task: PROCS: 1 MEM: 1024M NodeAccess: SHARED TasksPerNode: 8 NodeCount: 3 IWD: [NONE] Executable: [NONE] Bypass: 63 StartCount: 0 PartitionMask: [ALL] Flags: RESTARTABLE PE: 24.00 StartPriority: 2513 job cannot run in partition DEFAULT (idle procs do not meet requirements : 0 of 24 procs found) idle procs: 732 feasible procs: 0 Rejection Reasons: [Features : 58][CPU : 22][State : 18][ReserveTime : 8] Detailed Node Availability Information: node001 rejected : ReserveTime node002 rejected : ReserveTime node003 rejected : ReserveTime node004 rejected : State node005 rejected : ReserveTime node006 rejected : ReserveTime node007 rejected : ReserveTime node008 rejected : ReserveTime node009 rejected : ReserveTime node010 rejected : CPU node011 rejected : CPU node012 rejected : CPU node013 rejected : State node014 rejected : CPU node015 rejected : CPU node016 rejected : CPU node017 rejected : State node018 rejected : State node019 rejected : State node020 rejected : State node021 rejected : State node022 rejected : State node023 rejected : State node024 rejected : State node025 rejected : State node026 rejected : State node027 rejected : State node028 rejected : State node029 rejected : State node030 rejected : State node031 rejected : State node032 rejected : CPU node033 rejected : CPU node034 rejected : CPU node035 rejected : CPU node036 rejected : CPU node037 rejected : CPU 
node038 rejected : CPU node039 rejected : CPU node040 rejected : CPU node041 rejected : State node042 rejected : CPU node043 rejected : CPU node044 rejected : CPU node045 rejected : CPU node046 rejected : CPU node047 rejected : CPU node048 rejected : CPU node049 rejected : Features node050 rejected : Features node051 rejected : Features node052 rejected : Features node053 rejected : Features node054 rejected : Features node055 rejected : Features node056 rejected : Features node057 rejected : Features node058 rejected : Features node059 rejected : Features node060 rejected : Features node061 rejected : Features node062 rejected : Features node063 rejected : Features node064 rejected : Features node065 rejected : Features node066 rejected : Features node067 rejected : Features node068 rejected : Features node069 rejected : Features node070 rejected : Features node071 rejected : Features node072 rejected : Features node073 rejected : Features node074 rejected : Features node075 rejected : Features node076 rejected : Features node077 rejected : Features node078 rejected : Features node079 rejected : Features node080 rejected : Features sandbox1 rejected : Features sandbox2 rejected : Features sandbox3 rejected : Features sandbox4 rejected : Features sandbox5 rejected : Features sandbox6 rejected : Features sandbox7 rejected : Features sandbox8 rejected : Features sandbox9 rejected : Features sandbox10 rejected : Features node081 rejected : Features node082 rejected : Features node083 rejected : Features node084 rejected : Features node085 rejected : Features node086 rejected : Features node087 rejected : Features node088 rejected : Features node089 rejected : Features node090 rejected : Features node091 rejected : Features node092 rejected : Features node093 rejected : Features node094 rejected : Features node095 rejected : Features node096 rejected : Features job cannot run in partition SMP (insufficient idle procs available: 0 < 24) </pre> How do you interpret all this output? First of all, if your output does not look anything like this, for example if you see an error message about not being able to contact a server, then the scheduler is not running or there is some other problem: see [[Known problems]]. If you need help in this situation, contact one of the [[administrators]]. Assuming your output looks like the above, first of all you should check that the details of your job agree with what you think you submitted. Check <tt>NodeCount</tt> and <tt>TasksPerNode</tt> and the <tt>WallTime</tt> request. You may also want to check the output of <tt>qstat -f <jobid></tt>. Now go down to the reason why <tt>job cannot run in partition DEFAULT</tt>. Normally, this will be as above: this is telling you that there are not enough available CPUs for your job. But suppose you think the cluster is close to empty. Why is this? Look down the list of individual nodes to see why each one is rejecting your job. The example above shows the common reasons: * Features: this means that you have asked for a feature that the node in question doesn't have. For example, jobs submitted to the main cluster will never run on the [[sandbox]] machines. It is normal for many nodes to reject your job for this reason. * State: usually this means that the node in question is marked busy by the system, which means that it is running as many jobs as possible. In this example, there are a number of busy nodes. It can also mean that nodes are down, i.e. that there is a real problem. 
If many nodes reject your job because of 'State' but you think they cannot be busy, check <tt>pbsnodes -l</tt> to see if they are 'down' and report a problem if so. Nodes that are 'offline' in <tt>pbsnodes -l</tt> have been taken offline by the administrators for maintenance and there is no need to report them unless you think this is an error. * CPU: the node is not busy, but it has fewer available CPUs than you have requested. This is most likely because it is running one or more jobs. In the example above, the user has asked for 3 nodes with 8 CPUs each: the nodes rejecting the job because of CPU do not have 8 free CPUs. (If all nodes are rejecting your job for this reason, you should check whether you are asking for an impossible configuration. Jobs submitted to the main cluster asking for more than 32 cores per node will sit there forever waiting for a machine with a suitable number of cores to become available. Does your job really need 3 machines with 8 CPUs, or would it work with 24 CPUs distributed over the cluster?) * ReserveTime: your job asks to run on the node at a time when it is reserved for another purpose. This could be due to another job ahead of you in the queue, or it could be a system reservation. If you see this message but there are no jobs ahead of you in the queue, check your e-mail for a recent message from the administrators regarding scheduled downtime. If you are seeing all of these and you can convince yourself that the individual nodes are rejecting your jobs for valid reasons, then the only thing you can do is wait for the conditions that are blocking your job to clear. Only if you can't convince yourself of this should you contact the administrators. 0f6f69e1c7c68e01bfdff4b85fbb99a4e1745262 Cluster bibliography 0 30 557 476 2018-07-25T09:24:53Z Akukol 3 wikitext text/x-wiki Please add details of any papers written using the cluster here. This allows us to keep track of the productivity of the cluster. You can use any format you like so long as you identify the authors, the title (this helps us keep track of what different areas of research are being carried out with the cluster), and an indication of the bibliographic details and date. * Stotz, H., Pascoe, H., Parham, H., Fitt, B., Mashanova, A., Kukol, A., ... & Hossein, B. ('''2018'''). Genomic evidence for genes encoding leucine-rich repeat receptors linked to resistance against the eukaryotic extra- and intracellular Brassica napus pathogens Leptosphaeria maculans and Plasmodiophora brassicae. ''PLoS ONE'', 13(06), 1-17. [e0198201]. DOI: 10.1371/journal.pone.0198201 * Patel, H., & Kukol, A. ('''2017'''). Evolutionary conservation of influenza A PB2 sequences reveals potential target sites for small molecule inhibitors. ''Virology'', 509, 112-120. DOI: 10.1016/j.virol.2017.06.009 * Patel, H., & Kukol, A. ('''2016'''). Evaluation of a novel virtual screening strategy using receptor decoy binding sites. ''J Negative Results in BioMedicine'', 15(15). * Kukol A, Hughes DJ ('''2014'''), Large-scale analysis of Influenza A virus nucleoprotein sequence conservation reveals potential drug-target sites, ''Virology'', 454/55: 40-47. * Poojari, C, Kukol, A, Strodel, B ('''2013'''). How the amyloid-β peptide and membranes affect each other: An extensive simulation study, ''Biochimica et Biophysica Acta - Biomembranes'', 1828(2), 327-339. * Steuernagel O., Kakofengitis D., Ritter G., Wigner flow reveals topological order in quantum phase space dynamics, '''2012''', accepted by ''PRL''.
* Hardcastle MJ, Krause MGH, Numerical modelling of the lobes of radio galaxies in cluster environments, '''2012''', submitted to ''MNRAS'' * Kalia M, Kukol A ('''2011''') Structure and dynamics of the kinase IKK-β – a key regulator of the NF-kappa B transcription factor, ''J Struct Biol'', 176(2), 133-142. * Kukol A, Consensus virtual screening approaches to predict protein ligands ('''2011'''), ''Eur J Med Chem'', 46(9), 4661-4664. * Hardcastle MJ, Croston JH, Modelling TeV gamma-ray emission from the kiloparsec-scale jets of Centaurus A and M87, '''2011''', ''MNRAS'' 415 433 * Goodger JL, Croston JH, Hardcastle MJ, The influence of radio-loud AGN on groups of galaxies: Paper I - the X-ray luminosity-temperature relationship, submitted to ''MNRAS'', May 2011. * de Sousa G, Maex R, Adams R, Davey N, Steuber V, Optimization of neuronal morphologies for pattern recognition, published at ''BMC Neuroscience'' '''2010''' * Karen Safaryan, Reinoud Maex, Rod Adams, Neil Davey, and Volker Steuber The Effect of Non-Specific LTD on Pattern Recognition in Cerebellar Purkinje Cells, published at ''BMC Neuroscience'' '''11''', P92 0b8fb4ce3803ca9c11be84c233c4c1cb9518bcf7 559 557 2018-07-27T05:26:28Z Ptaylor 16 Updated with P Taylor's papers. wikitext text/x-wiki Please add details of any papers written using the cluster here. This allows us to keep track of the productivity of the cluster. You can use any format you like so long as you identify the authors, the title (this helps us keep track of what different areas of research are being carried out with the cluster), and an indication of the bibliographic details and date. * Taylor, P., Federrath, C., Kobayashi, C. The origin of kinematically distinct cores and misaligned gas discs in galaxies from cosmological simulations. '''2018'''. MNRAS, 479, 141 * Taylor, P., Kobayashi, C. The metallicity and elemental abundance gradients of simulated galaxies and their environmental dependence. '''2017'''. MNRAS, 471, 3856 * Taylor, P., Federrath, C., Kobayashi, C. Star formation in simulated galaxies: understanding the transition to quiescence at 3x10^10 solar masses. '''2017'''. MNRAS, 469, 4249 *Taylor, P., Kobayashi, C. Time evolution of galaxy scaling relations in cosmological simulations. '''2016'''. MNRAS, 463, 2465 *Taylor, P., Kobayashi, C. Quantifying AGN-driven metal-enhanced outflows in chemodynamical simulations. '''2015'''. MNRAS, 452L, 59 *Taylor, P., Kobayashi, C. The effects of AGN feedback on present-day galaxy properties in cosmological simulations. '''2015'''. MNRAS, 448, 1835 *Taylor, P., Kobayashi, C. Seeding black holes in cosmological simulations. '''2014'''. MNRAS, 442, 2751 * Stotz, H., Pascoe, H., Parham, H., Fitt, B., Mashanova, A., Kukol, A., ... & Hossein, B. ('''2018'''). Genomic evidence for genes encoding leucine-rich repeat receptors linked to resistance against the eukaryotic extra- and intracellular Brassica napus pathogens Leptosphaeria maculans and Plasmodiophora brassicae. ''PLoS ONE'', 13(06), 1-17. [e0198201]. DOI: 10.1371/journal.pone.0198201 * Patel, H., & Kukol, A. ('''2017'''). Evolutionary conservation of influenza A PB2 sequences reveals potential target sites for small molecule inhibitors. ''Virology'', 509, 112-120. DOI: 10.1016/j.virol.2017.06.009 * Patel, H., & Kukol, A. ('''2016'''). Evaluation of a novel virtual screening strategy using receptor decoy binding sites. ''J Negative Results in BioMedicine'', 15(15). 
* Kukol A, Hughes DJ ('''2014'''), Large-scale analysis of Influenza A virus nucleoprotein sequence conservation reveals potential drug-target sites, ''Virology'', 454/55: 40-47. * Poojari, C, Kukol, A, Strodel, B ('''2013'''). How the amyloid-β peptide and membranes affect each other: An extensive simulation study, ''Biochimica et Biophysica Acta - Biomembranes'', 1828(2), 327-339. * Steuernagel O., Kakofengitis D., Ritter G., Wigner flow reveals topological order in quantum phase space dynamics, '''2012''', accepted by ''PRL''. * Hardcastle MJ, Krause MGH, Numerical modelling of the lobes of radio galaxies in cluster environments, '''2012''', submitted to ''MNRAS'' * Kalia M, Kukol A ('''2011''') Structure and dynamics of the kinase IKK-β – a key regulator of the NF-kappa B transcription factor, ''J Struct Biol'', 176(2), 133-142. * Kukol A, Consensus virtual screening approaches to predict protein ligands ('''2011'''), ''Eur J Med Chem'', 46(9), 4661-4664. * Hardcastle MJ, Croston JH, Modelling TeV gamma-ray emission from the kiloparsec-scale jets of Centaurus A and M87, '''2011''', ''MNRAS'' 415 433 * Goodger JL, Croston JH, Hardcastle MJ, The influence of radio-loud AGN on groups of galaxies: Paper I - the X-ray luminosity-temperature relationship, submitted to ''MNRAS'', May 2011. * de Sousa G, Maex R, Adams R, Davey N, Steuber V, Optimization of neuronal morphologies for pattern recognition, published at ''BMC Neuroscience'' '''2010''' * Karen Safaryan, Reinoud Maex, Rod Adams, Neil Davey, and Volker Steuber, The Effect of Non-Specific LTD on Pattern Recognition in Cerebellar Purkinje Cells, published at ''BMC Neuroscience'' '''11''', P92 1255bdc910ca743a430557203b098e6af177cdb4 Gromacs 0 19 558 477 2018-07-25T09:50:04Z Akukol 3 wikitext text/x-wiki [http://www.gromacs.org Gromacs] is a software package for molecular dynamics simulation. It treats molecules as particles in a classical mechanics force field. == '''How to perform a simulation with Gromacs' mdrun:''' == 1) You need to prepare the binary simulation start file (tpr-file) either on your local LINUX machine or on the headnode of the cluster. If you prepare it on the local machine, make sure you use Gromacs version 2018. In order to run Gromacs on the headnode for preparation, it is a good idea to put the following into your .cshrc file: source /soft/gromacs-2018/bin/GMXRC (or /soft/gromacs-2018-gpu/bin/GMXRC for GPU preparation) export LD_LIBRARY_PATH="/soft/mpi/mvapich2-1.6/lib:${LD_LIBRARY_PATH}" export LD_LIBRARY_PATH="/soft/gcc-6.4/lib64:${LD_LIBRARY_PATH}" 2) Prepare a shell script (e.g. runjob.sh) as shown in the example below. [[Jobs|More info.]] 3) Make the script executable: chmod +x runjob.sh 4) Submit the job to the cluster: qsub runjob.sh The script below needs to be adjusted according to your needs, i.e. the user name (#PBS -u), the maximum time a job should run (#PBS -l walltime=<hours>:<min>:<sec>), the working directory and the details of the mdrun command. Look here for [[groperform|optimising performance]]. There are two versions of Gromacs 2018.2, for GPU and non-GPU use, located in /soft/gromacs-2018 and /soft/gromacs-2018-gpu. Note that all GPUs attached to the node are used automatically. The maximum walltime is 48 hours.
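As an illustration of step 1, preparing the tpr-file on the headnode (once GMXRC has been sourced as above) might look something like the sketch below; the input file names are illustrative and will depend on how your system is set up.
<pre>
# Combine the run parameters, starting coordinates and topology
# into the binary run input file that mdrun will use on the nodes
gmx grompp -f md.mdp -c system.gro -p topol.top -o run.tpr
</pre>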
[http://www.phy.bme.hu/~cluster/docs/PBS.html A good tutorial for using PBS] ''Andreas/Hershna'' '''For GPU:''' -------------- <pre>#!/bin/sh #PBS -N GromacsTest #PBS -q gpu #PBS -l nodes=1:ppn=16 #PBS -l walltime=48:00:00 #PBS -j oe #PBS -k oe #PBS -u hpatel # runs a job with name 'GromacsTest' on the gpu machine on the cluster # uses 1 GPU machine # set a maximum time of forty-eight hours (walltime) # merge 'standard output' and 'standard error' into a single output stream (-j oe) # make the output appear while the job is running (-k oe) # specifies user 'hpatel' # set required paths: source /soft/gromacs-2018-gpu/bin/GMXRC # specify working directory: cd /home/hpatel/gromacsGPU export LD_LIBRARY_PATH="/soft/mpi/mvapich2-1.6/lib:${LD_LIBRARY_PATH}" export LD_LIBRARY_PATH="/soft/gcc-6.4/lib64:${LD_LIBRARY_PATH}" ### This is the command ### gmx mdrun -s run.tpr -c after_md.gro -v -stepout 1000 ### command end ### # start with 'qsub runjob.sh' </pre> -------------- For non-GPU use, Gromacs is optimised for the newer nodes that contain 32 cores. In order to make sure that the job runs on these nodes, you have to request them with #PBS -l nodes=1:ppn=32. An example of a job script is shown below: '''Without use of GPU:''' -------------- <pre>#!/bin/sh #PBS -N GromacsTest #PBS -q main #PBS -l nodes=1:ppn=32 #PBS -l walltime=48:00:00 #PBS -j oe #PBS -k oe #PBS -u hpatel # runs a job with name 'GromacsTest' on the main cluster # set a maximum time of forty-eight hours (walltime) # merge 'standard output' and 'standard error' into a single output stream (-j oe) # make the output appear while the job is running (-k oe) # specifies user 'hpatel' # set required paths: source /soft/gromacs-2018/bin/GMXRC # specify working directory: cd /home/hpatel/gromacs export LD_LIBRARY_PATH="/soft/mpi/mvapich2-1.6/lib:${LD_LIBRARY_PATH}" export LD_LIBRARY_PATH="/soft/gcc-6.4/lib64:${LD_LIBRARY_PATH}" ### This is the command ### gmx mdrun -s run.tpr -c after_md.gro -v -stepout 1000 ### command end ### # start with 'qsub runjob.sh' </pre> 32806858290e126aa813ebee02c3892482fa8354 LOFAR 0 47 560 451 2018-09-30T08:43:44Z Mjh 2 wikitext text/x-wiki You will need a <tt>.casarc</tt> file; something like this should do: <pre> measures.DE200.directory: /soft/casapy/data/ephemerides measures.DE405.directory: /soft/casapy/data/ephemerides measures.line.directory: /soft/casapy/data/ephemerides measures.sources.directory: /soft/casapy/data/ephemerides measures.comet.directory: /soft/casapy/data/ephemerides measures.ierseop97.directory: /soft/casapy/data/geodetic measures.ierspredict.directory: /soft/casapy/data/geodetic measures.tai_utc.directory: /soft/casapy/data/geodetic measures.igrf.directory: /soft/casapy/data/geodetic measures.observatory.directory: /soft/casapy/data/geodetic </pre> Then a full setup for up-to-date versions of the LOFAR software looks something like this: <pre> bash source /soft/lofar-270618/init.sh </pre> The LOFAR software is frequently updated. For the most up-to-date versions, look for /soft/lofar-date where date is a numeric build date. Then source /soft/lofar-date/lofarinit.csh instead. 9cd5009eb1d9202410f651e2099df2b145653d30 Software 0 17 561 471 2018-10-05T07:19:13Z Mjh 2 wikitext text/x-wiki This page documents locations of software. Detailed local documentation of software should go in a page specific to that software.
* <u>[[Gromacs]]</u>: 2016 (with GPU acceleration) installed in <tt>/soft/gromacs-2016-gpu</tt> * Autodock <u>[[Vina]]</u>: 1.1.1 installed in <tt>/soft/autodock_vina_1_1_1_linux_x86</tt> * <u>[[Autodock]]</u>: 4.2 installed in <tt>/soft/autodock</tt> * <u>[[iGemDock]]</u>: 2.1 installed in <tt>/soft/iGEMDOCKv2.1-centos/</tt> * <u>[[AIPS]]</u>: 31DEC10 installed in <tt>/soft/aips</tt> * <u>[[CASA]]</u>: installed in <tt>/soft/casapy...</tt> * <u>[[IDL]]</u>: in <tt>/soft/idl/idl/bin</tt> * <u>[[Matlab]]</u>: in <tt>/soft/MATLAB/R2018b/bin/matlab</tt> * <u>[[Starlink]]</u>: in <tt>/soft/star</tt> or <tt>/soft/stardev</tt> * <u>[[LOFAR]]</u>: in <tt>/soft/lofar-xxxxxx</tt>, see page for details * <u>[[Python packages]]</u>: set your PYTHONPATH to include <tt>/soft/python/lib64/python2.7/site-packages</tt> * <u>[[neuron]]</u>: in <tt> /soft/nrn</tt> * <u>[[Miriad]]</u>: in <tt> /soft/miriad</tt> * <u>[[ciao]]</u>: in <tt>/soft/ciao-x.x</tt> cf7c44178b56e87b7d5be6e46d16be4b16ee72c6 To do 0 78 562 2018-12-24T08:08:52Z Mjh 2 Created page with "= To do, downtime early 2019 = * Node software update * Firmware update node001-080 * Replace old stri-cluster * Infiniband tidy * Remove remaining 2TB dot hill kit * beegfs..." wikitext text/x-wiki = To do, downtime early 2019 = * Node software update * Firmware update node001-080 * Replace old stri-cluster * Infiniband tidy * Remove remaining 2TB dot hill kit * beegfs servers to new location * beegfs upgrade to v7 * Tidy stack 3 * node070 hardware issue * labelling * torque upgrade? 6b1bbba01b726cffc1a9d21890de36cb4b82a6de 566 562 2019-01-01T19:11:02Z Mjh 2 /* To do, downtime early 2019 */ wikitext text/x-wiki = To do, downtime early 2019 = * Node software update * Firmware update node001-080 * Replace old stri-cluster * Infiniband tidy * Remove remaining 2TB dot hill kit * beegfs servers to new location * beegfs upgrade to v7 * Tidy stack 3 * node070 hardware issue * labelling * torque upgrade? * all new nodes to LOFAR-capable * sort out fstabs on all nodes 1dc946cf09bc363efcce012a6e3923307253ee88 568 566 2019-01-28T11:28:40Z Mjh 2 /* To do, downtime early 2019 */ wikitext text/x-wiki = To do, downtime early 2019 = == Hardware == * Replace old stri-cluster * Infiniband tidy * Remove remaining 2TB dot hill kit * beegfs servers to new location * Tidy stack 3 * node070 hardware issue * labelling == Software == * Node software update * Firmware update node001-080 * beegfs upgrade to v7 * torque upgrade? * all new nodes to LOFAR-capable * sort out fstabs on all nodes 38747b08431438a6a307dd4fa1f9d2839c7696db 569 568 2019-01-28T11:43:39Z Mjh 2 /* To do, downtime early 2019 */ wikitext text/x-wiki = To do, downtime early 2019 = == Hardware == * Replace old stri-cluster * Infiniband tidy * beegfs servers to new location * Tidy stack 3 * node070 hardware issue * labelling == Software == * Node software update * Firmware update node001-080 * beegfs upgrade to v7 * torque upgrade? * all new nodes to LOFAR-capable * sort out fstabs on all nodes 38d3d857856e471c0551db66a62fc525fa51e0f6 570 569 2019-01-28T12:05:14Z Mjh 2 /* Hardware */ wikitext text/x-wiki = To do, downtime early 2019 = == Hardware == * Replace old stri-cluster * Infiniband tidy * beegfs servers to new location * Tidy stack 3 * node070 hardware issue -- DIMM B6 * labelling == Software == * Node software update * Firmware update node001-080 * beegfs upgrade to v7 * torque upgrade? 
* all new nodes to LOFAR-capable * sort out fstabs on all nodes 6d5be3d9026943324a5b2e03840f42d290f63786 571 570 2019-01-28T12:05:32Z Mjh 2 /* Software */ wikitext text/x-wiki = To do, downtime early 2019 = == Hardware == * Replace old stri-cluster * Infiniband tidy * beegfs servers to new location * Tidy stack 3 * node070 hardware issue -- DIMM B6 * labelling == Software == * Node software update * node024 upgrade * Firmware update node001-080 * beegfs upgrade to v7 * torque upgrade? * all new nodes to LOFAR-capable * sort out fstabs on all nodes 20eccc61059f45927a4b44be64802c8ef9860135 572 571 2019-01-28T15:10:30Z Mjh 2 /* To do, downtime early 2019 */ wikitext text/x-wiki = To do, downtime early 2019 = == Hardware == * Replace old stri-cluster * beegfs servers to new location * Tidy stack 3 * node070 hardware issue -- DIMM B6 * labelling == Software == * Node software update * node024 reinstall * Firmware update node001-080 * beegfs upgrade to v7 * torque upgrade? * all new nodes to LOFAR-capable * sort out fstabs on all nodes cd3bc17bbd979be56754d134e03d0319a1f08168 573 572 2019-01-28T16:27:41Z Mjh 2 /* To do, downtime early 2019 */ wikitext text/x-wiki = To do, downtime early 2019 = == Hardware == * Replace old stri-cluster * beegfs servers to new location * Tidy stack 3 * node070 hardware issue -- DIMM B6 * labelling == Software == * Node software update * node024 reinstall * Firmware update node001-080 * beegfs upgrade to v7 * torque upgrade? * all new nodes to LOFAR-capable * sort out fstabs on all nodes == Notes == To update using lifecycle controller: * log in to node * select lifecycle controller from boot menu and reboot (console controls -- ctrl-alt-del -- apply) * wait for boot to lifecycle controller * click Ok to all options and wait for networking, ignore ipv6 warning * click 'get the latest firmware' * select 'ftp server' * say 'ftp.dell.com' NOT 'downloads.dell.com' * wait... * click 'apply' * wait... * eventually the machine will reboot to the lifecycle controller again. Now click 'exit' at top right. f7840dc18c3e2601bae89182effec7199a364f09 Python packages 0 49 567 347 2019-01-11T14:55:11Z Mjh 2 wikitext text/x-wiki Local python packages installed in <tt>/soft</tt> include * astropy [http://pypi.python.org/packages/source/a/astropy/astropy-0.2.4.tar.gz] * kapteyn [http://www.astro.rug.nl/software/kapteyn/] * h5py * mpi4py * hcluster Set your PYTHONPATH to <tt>/soft/python/lib64/python2.7/site-packages</tt> to make these available. Please don't install local copies of software that is already globally available! If you need an upgraded version, talk to the system [[administrators]]. == Python 3.6 == A separate Python 3.6 installation is available. To use this, just run python36 or python3.6 . If you want to use system-wide installations of packages you will need to set your PYTHONPATH to <tt>/soft/python3/usr/local/lib64/python3.6/site-packages</tt>. 7b8be0dfba5909f5783268f54aa6605e0f463e5e To do 0 78 574 573 2019-01-28T16:39:44Z Mjh 2 /* Notes */ wikitext text/x-wiki = To do, downtime early 2019 = == Hardware == * Replace old stri-cluster * beegfs servers to new location * Tidy stack 3 * node070 hardware issue -- DIMM B6 * labelling == Software == * Node software update * node024 reinstall * Firmware update node001-080 * beegfs upgrade to v7 * torque upgrade? 
* all new nodes to LOFAR-capable * sort out fstabs on all nodes == Notes == To update using lifecycle controller: * log in to nodexxx.management * select 'launch virtual console' * if popups blocked, allow and launch again * select lifecycle controller from boot menu and reboot (console controls -- ctrl-alt-del -- apply) * wait for boot to lifecycle controller * click Ok to all options and wait for networking, ignore ipv6 warning * click 'get the latest firmware' * select 'ftp server' * say 'ftp.dell.com' NOT 'downloads.dell.com' * wait... * click 'apply' * wait... * eventually the machine will reboot to the lifecycle controller again. Now click 'exit' at top right. 7d13b064f1941ff949062f76acba5e2cb22a56e4 575 574 2019-01-28T16:44:26Z Mjh 2 /* Hardware */ wikitext text/x-wiki = To do, downtime early 2019 = == Hardware == * Replace old stri-cluster * beegfs servers to new location * node070 hardware issue -- DIMM B6 * labelling == Software == * Node software update * node024 reinstall * Firmware update node001-080 * beegfs upgrade to v7 * torque upgrade? * all new nodes to LOFAR-capable * sort out fstabs on all nodes == Notes == To update using lifecycle controller: * log in to nodexxx.management * select 'launch virtual console' * if popups blocked, allow and launch again * select lifecycle controller from boot menu and reboot (console controls -- ctrl-alt-del -- apply) * wait for boot to lifecycle controller * click Ok to all options and wait for networking, ignore ipv6 warning * click 'get the latest firmware' * select 'ftp server' * say 'ftp.dell.com' NOT 'downloads.dell.com' * wait... * click 'apply' * wait... * eventually the machine will reboot to the lifecycle controller again. Now click 'exit' at top right. f79c4e761c5455fa851c5de4765efacd12ef81cd 576 575 2019-01-29T08:33:07Z Mjh 2 /* Notes */ wikitext text/x-wiki = To do, downtime early 2019 = == Hardware == * Replace old stri-cluster * beegfs servers to new location * node070 hardware issue -- DIMM B6 * labelling == Software == * Node software update * node024 reinstall * Firmware update node001-080 * beegfs upgrade to v7 * torque upgrade? * all new nodes to LOFAR-capable * sort out fstabs on all nodes == Notes == To update using lifecycle controller: * run chromium-browser, unblock popups * log in to nodexxx.management * select 'launch virtual console' * select lifecycle controller from boot menu and reboot (console controls -- ctrl-alt-del -- apply) * wait for boot to lifecycle controller * click Ok to all options and wait for networking, ignore ipv6 warning * click 'get the latest firmware' * select 'ftp server' * say 'ftp.dell.com' NOT 'downloads.dell.com' * wait... * click 'apply' * wait... * the machine may power off, power it on again (you may need to restart virtual console) * eventually the machine will reboot to the lifecycle controller again. Now click 'exit' at top right. * close the virtual console when a Linux console prompt is showing. 964b840d57d4e836116a2972bc21847b8e955850 577 576 2019-01-29T17:41:20Z Mjh 2 wikitext text/x-wiki = To do, downtime early 2019 = == Hardware == * Replace old stri-cluster * beegfs servers to new location * node070 hardware issue -- DIMM B6 * labelling == Software == * Node software update * node024 reinstall * Firmware update node001-080 * beegfs upgrade to v7 * torque upgrade? 
* all new nodes to LOFAR-capable * sort out fstabs on all nodes == Notes == To update using lifecycle controller: * run chromium-browser, unblock popups * log in to nodexxx.management * select 'launch virtual console' * select lifecycle controller from boot menu and reboot (console controls -- ctrl-alt-del -- apply) * wait for boot to lifecycle controller * click Ok to all options and wait for networking, ignore ipv6 warning * click 'get the latest firmware' * select 'ftp server' * say 'ftp.dell.com' NOT 'downloads.dell.com' * wait... * click 'apply' * wait... * the machine may power off, power it on again (you may need to restart virtual console) * eventually the machine will reboot to the lifecycle controller again. Now click 'exit' at top right. * close the virtual console when a Linux console prompt is showing. == Services that need to work on new head node == * DNS * NTP server/broadcast * Torque * Maui * Web server * Mariadb * haproxy ssh a1b6b415617e7fa4f1283dd703bf0c558ac1e613 578 577 2019-01-29T17:42:32Z Mjh 2 /* Services that need to work on new head node */ wikitext text/x-wiki = To do, downtime early 2019 = == Hardware == * Replace old stri-cluster * beegfs servers to new location * node070 hardware issue -- DIMM B6 * labelling == Software == * Node software update * node024 reinstall * Firmware update node001-080 * beegfs upgrade to v7 * torque upgrade? * all new nodes to LOFAR-capable * sort out fstabs on all nodes == Notes == To update using lifecycle controller: * run chromium-browser, unblock popups * log in to nodexxx.management * select 'launch virtual console' * select lifecycle controller from boot menu and reboot (console controls -- ctrl-alt-del -- apply) * wait for boot to lifecycle controller * click Ok to all options and wait for networking, ignore ipv6 warning * click 'get the latest firmware' * select 'ftp server' * say 'ftp.dell.com' NOT 'downloads.dell.com' * wait... * click 'apply' * wait... * the machine may power off, power it on again (you may need to restart virtual console) * eventually the machine will reboot to the lifecycle controller again. Now click 'exit' at top right. * close the virtual console when a Linux console prompt is showing. == Services that need to work on new head node == * DNS * NTP server/broadcast * Torque * Maui * Web server * Mariadb * haproxy ssh * Munin * Ganglia 1d389a46e0d84337ff630ec3a6f93560fbbcd4c0 579 578 2019-01-29T17:47:11Z Mjh 2 /* Services that need to work on new head node */ wikitext text/x-wiki = To do, downtime early 2019 = == Hardware == * Replace old stri-cluster * beegfs servers to new location * node070 hardware issue -- DIMM B6 * labelling == Software == * Node software update * node024 reinstall * Firmware update node001-080 * beegfs upgrade to v7 * torque upgrade? * all new nodes to LOFAR-capable * sort out fstabs on all nodes == Notes == To update using lifecycle controller: * run chromium-browser, unblock popups * log in to nodexxx.management * select 'launch virtual console' * select lifecycle controller from boot menu and reboot (console controls -- ctrl-alt-del -- apply) * wait for boot to lifecycle controller * click Ok to all options and wait for networking, ignore ipv6 warning * click 'get the latest firmware' * select 'ftp server' * say 'ftp.dell.com' NOT 'downloads.dell.com' * wait... * click 'apply' * wait... * the machine may power off, power it on again (you may need to restart virtual console) * eventually the machine will reboot to the lifecycle controller again. Now click 'exit' at top right. 
* close the virtual console when a Linux console prompt is showing. == Services that need to work on new head node == * <s>DNS</s> * <s>NTP server</s>/broadcast * Torque * Maui * Web server * Mariadb * haproxy ssh * Munin * Ganglia 38e387e38c802992fdb1abe0cc6010d47bec901d 580 579 2019-01-30T17:41:48Z Mjh 2 wikitext text/x-wiki = To do, downtime early 2019 = == Hardware == * Replace old stri-cluster * beegfs servers to new location * node070 hardware issue -- DIMM B6 * labelling == Software == * Node software update * node024 reinstall * Firmware update node001-080 * beegfs upgrade to v7 * torque upgrade? * all new nodes to LOFAR-capable * sort out fstabs on all nodes == Notes == To update using lifecycle controller: * run chromium-browser, unblock popups * log in to nodexxx.management * select 'launch virtual console' * select lifecycle controller from boot menu and reboot (console controls -- ctrl-alt-del -- apply) * wait for boot to lifecycle controller * click Ok to all options and wait for networking, ignore ipv6 warning * click 'get the latest firmware' * select 'ftp server' * say 'ftp.dell.com' NOT 'downloads.dell.com' * wait... * click 'apply' * wait... * the machine may power off, power it on again (you may need to restart virtual console) * eventually the machine will reboot to the lifecycle controller again. Now click 'exit' at top right. * close the virtual console when a Linux console prompt is showing. == Services that need to work on new head node == * <s>DNS</s> * <s>NTP server</s>/broadcast * <s>Torque</s> * <s>Maui</s> * <s>Web server</s> * <s>Mariadb</s> * haproxy ssh * Munin * Ganglia * DHCP 92789397138b3f40e033336ae1b6a7f473553dd3 585 580 2019-01-31T14:22:28Z Mjh 2 /* Services that need to work on new head node */ wikitext text/x-wiki = To do, downtime early 2019 = == Hardware == * Replace old stri-cluster * beegfs servers to new location * node070 hardware issue -- DIMM B6 * labelling == Software == * Node software update * node024 reinstall * Firmware update node001-080 * beegfs upgrade to v7 * torque upgrade? * all new nodes to LOFAR-capable * sort out fstabs on all nodes == Notes == To update using lifecycle controller: * run chromium-browser, unblock popups * log in to nodexxx.management * select 'launch virtual console' * select lifecycle controller from boot menu and reboot (console controls -- ctrl-alt-del -- apply) * wait for boot to lifecycle controller * click Ok to all options and wait for networking, ignore ipv6 warning * click 'get the latest firmware' * select 'ftp server' * say 'ftp.dell.com' NOT 'downloads.dell.com' * wait... * click 'apply' * wait... * the machine may power off, power it on again (you may need to restart virtual console) * eventually the machine will reboot to the lifecycle controller again. Now click 'exit' at top right. * close the virtual console when a Linux console prompt is showing. 
== Services that need to work on new head node == * <s>DNS</s> * <s>NTP server</s>/broadcast * <s>Torque</s> * <s>Maui</s> * <s>Web server</s> * <s>Mariadb</s> * <s>haproxy ssh</s> * Munin * Ganglia * <s>DHCP</s> * <s>NFS</s> d343b64da77e23f72735bbd3786fd045b0f066d4 586 585 2019-01-31T14:22:59Z Mjh 2 /* Software */ wikitext text/x-wiki = To do, downtime early 2019 = == Hardware == * Replace old stri-cluster * beegfs servers to new location * node070 hardware issue -- DIMM B6 * labelling == Software == * <s>Node software update</s> * node024 reinstall * Firmware update node001-080 * beegfs upgrade to v7 * <s>torque upgrade</s> * all new nodes to LOFAR-capable * sort out fstabs on all nodes == Notes == To update using lifecycle controller: * run chromium-browser, unblock popups * log in to nodexxx.management * select 'launch virtual console' * select lifecycle controller from boot menu and reboot (console controls -- ctrl-alt-del -- apply) * wait for boot to lifecycle controller * click Ok to all options and wait for networking, ignore ipv6 warning * click 'get the latest firmware' * select 'ftp server' * say 'ftp.dell.com' NOT 'downloads.dell.com' * wait... * click 'apply' * wait... * the machine may power off, power it on again (you may need to restart virtual console) * eventually the machine will reboot to the lifecycle controller again. Now click 'exit' at top right. * close the virtual console when a Linux console prompt is showing. == Services that need to work on new head node == * <s>DNS</s> * <s>NTP server</s>/broadcast * <s>Torque</s> * <s>Maui</s> * <s>Web server</s> * <s>Mariadb</s> * <s>haproxy ssh</s> * Munin * Ganglia * <s>DHCP</s> * <s>NFS</s> a214b167b46fc51389d7053671d56508effe7a69 587 586 2019-01-31T14:23:28Z Mjh 2 /* Services that need to work on new head node */ wikitext text/x-wiki = To do, downtime early 2019 = == Hardware == * Replace old stri-cluster * beegfs servers to new location * node070 hardware issue -- DIMM B6 * labelling == Software == * <s>Node software update</s> * node024 reinstall * Firmware update node001-080 * beegfs upgrade to v7 * <s>torque upgrade</s> * all new nodes to LOFAR-capable * sort out fstabs on all nodes == Notes == To update using lifecycle controller: * run chromium-browser, unblock popups * log in to nodexxx.management * select 'launch virtual console' * select lifecycle controller from boot menu and reboot (console controls -- ctrl-alt-del -- apply) * wait for boot to lifecycle controller * click Ok to all options and wait for networking, ignore ipv6 warning * click 'get the latest firmware' * select 'ftp server' * say 'ftp.dell.com' NOT 'downloads.dell.com' * wait... * click 'apply' * wait... * the machine may power off, power it on again (you may need to restart virtual console) * eventually the machine will reboot to the lifecycle controller again. Now click 'exit' at top right. * close the virtual console when a Linux console prompt is showing. 
== Services that need to work on new head node == * <s>DNS</s> * <s>NTP server/broadcast</s> * <s>Torque</s> * <s>Maui</s> * <s>Web server</s> * <s>Mariadb</s> * <s>haproxy ssh</s> * <s>LDAP</s> * Munin * Ganglia * <s>DHCP</s> * <s>NFS</s> 7e67509c07cc2e5a93fe1c40253c700dbda8a8cb 588 587 2019-01-31T18:52:00Z Mjh 2 /* Hardware */ wikitext text/x-wiki = To do, downtime early 2019 = == Hardware == * <s>Replace old stri-cluster</s> * beegfs servers to new location * <s>node070 hardware issue -- DIMM B6</s> * labelling == Software == * <s>Node software update</s> * node024 reinstall * Firmware update node001-080 * beegfs upgrade to v7 * <s>torque upgrade</s> * all new nodes to LOFAR-capable * sort out fstabs on all nodes == Notes == To update using lifecycle controller: * run chromium-browser, unblock popups * log in to nodexxx.management * select 'launch virtual console' * select lifecycle controller from boot menu and reboot (console controls -- ctrl-alt-del -- apply) * wait for boot to lifecycle controller * click Ok to all options and wait for networking, ignore ipv6 warning * click 'get the latest firmware' * select 'ftp server' * say 'ftp.dell.com' NOT 'downloads.dell.com' * wait... * click 'apply' * wait... * the machine may power off, power it on again (you may need to restart virtual console) * eventually the machine will reboot to the lifecycle controller again. Now click 'exit' at top right. * close the virtual console when a Linux console prompt is showing. == Services that need to work on new head node == * <s>DNS</s> * <s>NTP server/broadcast</s> * <s>Torque</s> * <s>Maui</s> * <s>Web server</s> * <s>Mariadb</s> * <s>haproxy ssh</s> * <s>LDAP</s> * Munin * Ganglia * <s>DHCP</s> * <s>NFS</s> 35ec31c784f9f611344d443b4f551734dc58ac19 589 588 2019-01-31T18:52:39Z Mjh 2 /* Software */ wikitext text/x-wiki = To do, downtime early 2019 = == Hardware == * <s>Replace old stri-cluster</s> * beegfs servers to new location * <s>node070 hardware issue -- DIMM B6</s> * labelling == Software == * <s>Node software update</s> * node024 reinstall * <s>Firmware update node001-080</s> * <s>beegfs upgrade to v7</s> * <s>torque upgrade</s> * <s>all new nodes to LOFAR-capable</s> * <s>sort out fstabs on all nodes</s> == Notes == To update using lifecycle controller: * run chromium-browser, unblock popups * log in to nodexxx.management * select 'launch virtual console' * select lifecycle controller from boot menu and reboot (console controls -- ctrl-alt-del -- apply) * wait for boot to lifecycle controller * click Ok to all options and wait for networking, ignore ipv6 warning * click 'get the latest firmware' * select 'ftp server' * say 'ftp.dell.com' NOT 'downloads.dell.com' * wait... * click 'apply' * wait... * the machine may power off, power it on again (you may need to restart virtual console) * eventually the machine will reboot to the lifecycle controller again. Now click 'exit' at top right. * close the virtual console when a Linux console prompt is showing. 
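One quick way to spot-check the service checklist in the next section on the new head node might be a loop like the sketch below; the systemd unit names here are only guesses and will need replacing with whatever the services are actually called on this system:
<pre>
# Rough sketch only: unit names below are guesses, not the real names on this system.
for s in named chronyd pbs_server maui httpd mariadb haproxy dhcpd nfs-server slapd; do
    printf '%-12s %s\n' "$s" "$(systemctl is-active "$s")"
done
</pre>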
== Services that need to work on new head node == * <s>DNS</s> * <s>NTP server/broadcast</s> * <s>Torque</s> * <s>Maui</s> * <s>Web server</s> * <s>Mariadb</s> * <s>haproxy ssh</s> * <s>LDAP</s> * Munin * Ganglia * <s>DHCP</s> * <s>NFS</s> be79fb95241bdb8a5ca4b6fd223d2056d66df5fd 590 589 2019-02-01T13:07:47Z Mjh 2 /* To do, downtime early 2019 */ wikitext text/x-wiki = To do, downtime early 2019 = == Hardware == * <s>Replace old stri-cluster</s> * beegfs servers to new location * <s>node070 hardware issue -- DIMM B6</s> * labelling == Software == * <s>Node software update</s> * <s>node024 fix</s> (but not updated firmware) * <s>Firmware update node001-080</s> * <s>beegfs upgrade to v7</s> * <s>torque upgrade</s> * <s>all new nodes to LOFAR-capable</s> * <s>sort out fstabs on all nodes</s> == Notes == To update using lifecycle controller: * run chromium-browser, unblock popups * log in to nodexxx.management * select 'launch virtual console' * select lifecycle controller from boot menu and reboot (console controls -- ctrl-alt-del -- apply) * wait for boot to lifecycle controller * click Ok to all options and wait for networking, ignore ipv6 warning * click 'get the latest firmware' * select 'ftp server' * say 'ftp.dell.com' NOT 'downloads.dell.com' * wait... * click 'apply' * wait... * the machine may power off, power it on again (you may need to restart virtual console) * eventually the machine will reboot to the lifecycle controller again. Now click 'exit' at top right. * close the virtual console when a Linux console prompt is showing. == Services that need to work on new head node == * <s>DNS</s> * <s>NTP server/broadcast</s> * <s>Torque</s> * <s>Maui</s> * <s>Web server</s> * <s>Mariadb</s> * <s>haproxy ssh</s> * <s>LDAP</s> * Munin * Ganglia * Licence servers * <s>DHCP</s> * <s>NFS</s> df02d4b5692a1d1fccdd6e7020ce1514930dbaf6 Queues 0 15 581 516 2019-01-30T21:20:48Z Mjh 2 wikitext text/x-wiki There are six possible job queues available for general use on the system: * 'main' is the default queue: this submits to the main cluster. The maximum wall time on this queue is 1 week. * 'gpu' submits to the GPU nodes. The maximum wall time on this queue is 48 hours. * 'smp' submits to the [[SMP machines]]. It is not currently possible to run MPI jobs that span the SMP machines and the main or CAIR clusters. The maximum wall time for this queue is 48 hours. * 'cair_l' submits to the dedicated CAIR nodes. This queue is restricted to CAIR users. * 'car' submits to the dedicated CAR nodes. This queue is restricted to CAR users. The maximum wall time for this queue is 1 week. * 'forecast' submits to the dedicated air quality forecast nodes. == Default wall times == The default wall time for all the queues is 24 hours. If you want a job to run for longer, you must explicitly provide a wall time estimate. == Other defaults == The default memory use per process for the main queue is 1 Gb. If you need more memory than this, see the section on [[memory]]. The default number of nodes for a job on all queues is 1. 2dad18acdc355d1ba066bc821fe371789f311324 Architecture 0 7 582 564 2019-01-31T09:51:43Z Mjh 2 /* head/login nodes */ wikitext text/x-wiki The cluster consists of == head/login nodes == * 2 head nodes: headnode1 and headnode2, each with 192 GB RAM and 2 x 12-core Xeon processors, for user login and development * job/queue server, uhhpc, with 12 cores and 32 GB RAM: web, jobs, mail, DNS etc == compute nodes == Older compute nodes are in 16-blade chassis: currently chassis5-9 are populated. 
The newer compute nodes, which will be used by most people, are in two racks, rack1 and rack2. The smp machines are outside any particular rack or chassis. * 80 Xeons (Gold 6130) 2 socket x 16 core (no hyperthreading) with 192 GB RAM and FDR Infiniband form part of the Main cluster (node001-080: rack1, rack2), in the main queue * 12 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form the CAIR cluster (node081-92: chassis 6), in the cair_l queue * 2 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 24 GB RAM and DDR Infiniband form part of the main cluster (node093-94: chassis 6), in the main queue * 16 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form part of the CAR cluster (node097-node112: chassis 7), in the car queue * 16 Xeons (X5675s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form the rest of the CAR cluster (node113-128: chassis 8), in the car queue * 12 Xeons (E5-2660s) 2 socket x 8-core with 64 GB RAM and FDR Infiniband form the rest of the main cluster (node129-140: chassis 9), in the main queue * 4 Xeons (E5-2660s) 2 socket x 8-core with 16 GB RAM and FDR Infiniband are dedicated AQF (CAIR) nodes (node141-144: chassis 9), in the forecast queue * Four [[SMP machines]], two with 48 cores (4 sockets x 12 cores, Opteron 6174) and two with 32 (one 4 sockets x 8 cores, Xeon 4620), all with 256 GB RAM and QDR Infiniband (smp1-4). * Two SMP blades in chassis6 (node095-096) with 32 cores (2 sockets x 16 cores) and 256 GB RAM, in the smp queue * A [[GPUs|GPU]] machine, gpu1, in the gpu queue == GPUs == * See the separate page on [[GPUs]] == Servers and dedicated login nodes == * A file server, stri-server, which is a 2 socket x 8 core Xeon machine ** 473 TB of [[storage]] attached via Fibre Channel to this server. * A cair data processing server - [[cair-cluster]], providing logins and storage to CAIR users ** 132 TB of [[storage]] attached via Fibre Channel to this server. * A separate server -- [[cair-forecast]] -- providing an additional 80 TB of storage and processing for CAIR AQF use. * A separate server -- [[lofar-server]] -- providing logins and 100 TB of storage for [[LOFAR-UK Compute Facility|LOFAR-UK]] users * ramius, a dedicated 16-core machine * mancuso, a dedicated 20-core machine * dstorage1-10, file servers providing the BeegFS distributed [[storage]] == Networking == * Ethernet and infiniband switches provide connectivity. The image below shows the physical layout of the cluster components as at late 2011. The cluster is no longer in this configuration. http://stri-cluster.herts.ac.uk/cluster2.jpg 9d8f84c2156de54cefd5212f9faf9754cbc61e91 583 582 2019-01-31T09:52:30Z Mjh 2 /* compute nodes */ wikitext text/x-wiki The cluster consists of == head/login nodes == * 2 head nodes: headnode1 and headnode2, each with 192 GB RAM and 2 x 12-core Xeon processors, for user login and development * job/queue server, uhhpc, with 12 cores and 32 GB RAM: web, jobs, mail, DNS etc == compute nodes == Older compute nodes are in 16-blade chassis: currently chassis5-9 are populated. The newer compute nodes, which will be used by most people, are in three racks, rack1-3. The smp machines are outside any particular rack or chassis. 
* 80 Xeons (Gold 6130) 2 socket x 16 core (no hyperthreading) with 192 GB RAM and FDR Infiniband form part of the Main cluster (node001-080: rack1, rack2, rack3), in the main queue * 12 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form the CAIR cluster (node081-92: chassis 6), in the cair_l queue * 2 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 24 GB RAM and DDR Infiniband form part of the main cluster (node093-94: chassis 6), in the main queue * 16 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form part of the CAR cluster (node097-node112: chassis 7), in the car queue * 16 Xeons (X5675s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form the rest of the CAR cluster (node113-128: chassis 8), in the car queue * 12 Xeons (E5-2660s) 2 socket x 8-core with 64 GB RAM and FDR Infiniband form the rest of the main cluster (node129-140: chassis 9), in the main queue * 4 Xeons (E5-2660s) 2 socket x 8-core with 16 GB RAM and FDR Infiniband are dedicated AQF (CAIR) nodes (node141-144: chassis 9), in the forecast queue * Four [[SMP machines]], two with 48 cores (4 sockets x 12 cores, Opteron 6174) and two with 32 (one 4 sockets x 8 cores, Xeon 4620), all with 256 GB RAM and QDR Infiniband (smp1-4). * Two SMP blades in chassis6 (node095-096) with 32 cores (2 sockets x 16 cores) and 256 GB RAM, in the smp queue * A [[GPUs|GPU]] machine, gpu1, in the gpu queue == GPUs == * See the separate page on [[GPUs]] == Servers and dedicated login nodes == * A file server, stri-server, which is a 2 socket x 8 core Xeon machine ** 473 TB of [[storage]] attached via Fibre Channel to this server. * A cair data processing server - [[cair-cluster]], providing logins and storage to CAIR users ** 132 TB of [[storage]] attached via Fibre Channel to this server. * A separate server -- [[cair-forecast]] -- providing an additional 80 TB of storage and processing for CAIR AQF use. * A separate server -- [[lofar-server]] -- providing logins and 100 TB of storage for [[LOFAR-UK Compute Facility|LOFAR-UK]] users * ramius, a dedicated 16-core machine * mancuso, a dedicated 20-core machine * dstorage1-10, file servers providing the BeegFS distributed [[storage]] == Networking == * Ethernet and infiniband switches provide connectivity. The image below shows the physical layout of the cluster components as at late 2011. The cluster is no longer in this configuration. http://stri-cluster.herts.ac.uk/cluster2.jpg ff64ce9e34d00cd76c7ed2a4511a0dcba76189c0 584 583 2019-01-31T09:53:33Z Mjh 2 /* Servers and dedicated login nodes */ wikitext text/x-wiki The cluster consists of == head/login nodes == * 2 head nodes: headnode1 and headnode2, each with 192 GB RAM and 2 x 12-core Xeon processors, for user login and development * job/queue server, uhhpc, with 12 cores and 32 GB RAM: web, jobs, mail, DNS etc == compute nodes == Older compute nodes are in 16-blade chassis: currently chassis5-9 are populated. The newer compute nodes, which will be used by most people, are in three racks, rack1-3. The smp machines are outside any particular rack or chassis. 
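Which of the node classes listed below a job lands on is controlled by the queue and resource request given to <tt>qsub</tt> (see [[Queues]] and [[Jobs]]); a rough sketch of the directives involved, with illustrative values only:
<pre>
# Illustrative Torque directives only -- see the Queues and Jobs pages for details.
#PBS -q main               # main queue, e.g. the 2 x 16-core Gold 6130 nodes
#PBS -l nodes=1:ppn=32     # ask for all 32 cores of one such node
## or target other hardware by queue, e.g.:
##PBS -q smp
##PBS -q gpu
</pre>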
* 80 Xeons (Gold 6130) 2 socket x 16 core (no hyperthreading) with 192 GB RAM and FDR Infiniband form part of the Main cluster (node001-080: rack1, rack2, rack3), in the main queue * 12 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form the CAIR cluster (node081-92: chassis 6), in the cair_l queue * 2 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 24 GB RAM and DDR Infiniband form part of the main cluster (node093-94: chassis 6), in the main queue * 16 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form part of the CAR cluster (node097-node112: chassis 7), in the car queue * 16 Xeons (X5675s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form the rest of the CAR cluster (node113-128: chassis 8), in the car queue * 12 Xeons (E5-2660s) 2 socket x 8-core with 64 GB RAM and FDR Infiniband form the rest of the main cluster (node129-140: chassis 9), in the main queue * 4 Xeons (E5-2660s) 2 socket x 8-core with 16 GB RAM and FDR Infiniband are dedicated AQF (CAIR) nodes (node141-144: chassis 9), in the forecast queue * Four [[SMP machines]], two with 48 cores (4 sockets x 12 cores, Opteron 6174) and two with 32 (one 4 sockets x 8 cores, Xeon 4620), all with 256 GB RAM and QDR Infiniband (smp1-4). * Two SMP blades in chassis6 (node095-096) with 32 cores (2 sockets x 16 cores) and 256 GB RAM, in the smp queue * A [[GPUs|GPU]] machine, gpu1, in the gpu queue == GPUs == * See the separate page on [[GPUs]] == Servers and dedicated login nodes == * A file server, stri-server, which is a 2 socket x 8 core Xeon machine ** 473 TB of [[storage]] attached via Fibre Channel to this server. * A cair data processing server - [[cair-cluster]], providing logins and storage to CAIR users ** 132 TB of [[storage]] attached via Fibre Channel to this server. * A separate server -- [[cair-forecast]] -- providing an additional 80 TB of storage and processing for CAIR AQF use. * A separate server -- [[lofar-server]] -- providing logins and 100 TB of storage for [[LOFAR-UK Compute Facility|LOFAR-UK]] users * ramius, a dedicated 16-core machine * mancuso, a dedicated 20-core machine * dstorage1-10, file servers providing 830 TB of BeegFS distributed [[storage]] == Networking == * Ethernet and infiniband switches provide connectivity. The image below shows the physical layout of the cluster components as at late 2011. The cluster is no longer in this configuration. http://stri-cluster.herts.ac.uk/cluster2.jpg 99675ce8e597e0a24d455581609840c88671f0e3 Singularity 0 79 591 2019-03-02T17:51:53Z Mjh 2 Created page with "Singularity ([https://github.com/sylabs/singularity]) is installed on the cluster. This is our preferred way of running containerized applications, including Docker containers..." wikitext text/x-wiki Singularity ([https://github.com/sylabs/singularity]) is installed on the cluster. This is our preferred way of running containerized applications, including Docker containers. You need /soft/bin on your path to run singularity. You probably want to use the --bind option to bind data directories such as beegfs. db2b92247647fd13c3bd0f3189887335ba1cda4e 603 591 2019-10-10T09:27:03Z Mjh 2 wikitext text/x-wiki Singularity ([https://github.com/sylabs/singularity]) is installed on the cluster. This is our preferred way of running containerized applications, including Docker containers. You need /soft/bin on your path to run singularity. 
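A minimal invocation might look like the sketch below (the container image and the bound directory are placeholders, not real paths on the cluster):
<pre>
# Sketch only: my_image.sif and the /beegfs path are placeholders.
export PATH=/soft/bin:$PATH
singularity exec --bind /beegfs/mydata:/beegfs/mydata my_image.sif my_command
</pre>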
You probably want to use the --bind option to bind data directories such as beegfs. Note that singularity images can't be built on BeeGFS (they can be stored there once built). This will affect users converting from Docker images. If this causes you problems, please contact the [[administrators]]. 4da1cb60267910271ff7e00acbbc9718383fc536 Local disk space 0 48 592 341 2019-03-06T10:53:31Z Mjh 2 wikitext text/x-wiki The main compute nodes have a limited amount of local disk space (around 700 GB for node001-080 and 110 GB for node081-144). This area is mounted on /local and is only visible internally to the node. The normal expectation is that you will use the central [[storage]] for file operations, including data processing. However, there may be circumstances where it will be useful to copy some data to the nodes, do some I/O-intensive operations on it, and copy it back to the storage. In this case you may use the /local area. If you want to do this, to avoid interfering with other jobs: * You ''must'' reserve the maximum amount of space that your job will use, using the <tt>file</tt> option to <tt>qsub</tt>; e.g. <pre>qsub -l nodes=1,file=10gb</pre> * As part of your <tt>qsub</tt> script, you must create a directory in /local, unique to your job, in which your job will work. For example, you might want to do <pre> mkdir /local/$PBS_JOBID cd /local/$PBS_JOBID </pre> * You must only work in this directory, and the total filespace you use must not exceed the reserved amount. * When your job is finished, it must clear up the filespace it used before exiting; no files must be left in <tt>/local</tt>. Note that these rules do not apply to the <tt>/scratch</tt> directories on the [[SMP machines]]. dfbbbd4421cf03e278123f18b097fbd2c2c9920a Cluster bibliography 0 30 593 559 2019-04-12T09:40:31Z Ptaylor 16 wikitext text/x-wiki Please add details of any papers written using the cluster here. This allows us to keep track of the productivity of the cluster. You can use any format you like so long as you identify the authors, the title (this helps us keep track of what different areas of research are being carried out with the cluster), and an indication of the bibliographic details and date. * Taylor, P., Kobayashi, C., Federrath, C. The metallicity and elemental abundance maps of kinematically atypical galaxies for constraining minor merger and accretion histories. '''2019'''. MNRAS, 485, 3215 * Wang, E.X., Taylor, P., Federrath, C., Kobayashi, C. The impact of black hole seeding in cosmological simulations. '''2019'''. MNRAS, 483, 4640 * Taylor, P., Federrath, C., Kobayashi, C. The origin of kinematically distinct cores and misaligned gas discs in galaxies from cosmological simulations. '''2018'''. MNRAS, 479, 141 * Taylor, P., Kobayashi, C. The metallicity and elemental abundance gradients of simulated galaxies and their environmental dependence. '''2017'''. MNRAS, 471, 3856 * Taylor, P., Federrath, C., Kobayashi, C. Star formation in simulated galaxies: understanding the transition to quiescence at 3x10^10 solar masses. '''2017'''. MNRAS, 469, 4249 *Taylor, P., Kobayashi, C. Time evolution of galaxy scaling relations in cosmological simulations. '''2016'''. MNRAS, 463, 2465 *Taylor, P., Kobayashi, C. Quantifying AGN-driven metal-enhanced outflows in chemodynamical simulations. '''2015'''. MNRAS, 452L, 59 *Taylor, P., Kobayashi, C. The effects of AGN feedback on present-day galaxy properties in cosmological simulations. '''2015'''. MNRAS, 448, 1835 *Taylor, P., Kobayashi, C.
Seeding black holes in cosmological simulations. '''2014'''. MNRAS, 442, 2751 * Stotz, H., Pascoe, H., Parham, H., Fitt, B., Mashanova, A., Kukol, A., ... & Hossein, B. ('''2018'''). Genomic evidence for genes encoding leucine-rich repeat receptors linked to resistance against the eukaryotic extra- and intracellular Brassica napus pathogens Leptosphaeria maculans and Plasmodiophora brassicae. ''PLoS ONE'', 13(06), 1-17. [e0198201]. DOI: 10.1371/journal.pone.0198201 * Patel, H., & Kukol, A. ('''2017'''). Evolutionary conservation of influenza A PB2 sequences reveals potential target sites for small molecule inhibitors. ''Virology'', 509, 112-120. DOI: 10.1016/j.virol.2017.06.009 * Patel, H., & Kukol, A. ('''2016'''). Evaluation of a novel virtual screening strategy using receptor decoy binding sites. ''J Negative Results in BioMedicine'', 15(15). * Kukol A, Hughes DJ ('''2014'''), Large-scale analysis of Influenza A virus nucleoprotein sequence conservation reveals potential drug-target sites, ''Virology'', 454/55: 40-47. * Poojari, C, Kukol, A, Strodel, B ('''2013'''). How the amyloid-β peptide and membranes affect each other: An extensive simulation study, ''Biochimica et Biophysica Acta - Biomembranes'', 1828(2), 327-339. * Steuernagel O., Kakofengitis D., Ritter G., Wigner flow reveals topological order in quantum phase space dynamics, '''2012''', accepted by ''PRL''. * Hardcastle MJ, Krause MGH, Numerical modelling of the lobes of radio galaxies in cluster environments, '''2012''', submitted to ''MNRAS'' * Kalia M, Kukol A ('''2011''') Structure and dynamics of the kinase IKK-β – a key regulator of the NF-kappa B transcription factor, ''J Struct Biol'', 176(2), 133-142. * Kukol A, Consensus virtual screening approaches to predict protein ligands ('''2011'''), ''Eur J Med Chem'', 46(9), 4661-4664. * Hardcastle MJ, Croston JH, Modelling TeV gamma-ray emission from the kiloparsec-scale jets of Centaurus A and M87, '''2011''', ''MNRAS'' 415 433 * Goodger JL, Croston JH, Hardcastle MJ, The influence of radio-loud AGN on groups of galaxies: Paper I - the X-ray luminosity-temperature relationship, submitted to ''MNRAS'', May 2011. * de Sousa G, Maex R, Adams R, Davey N, Steuber V, Optimization of neuronal morphologies for pattern recognition, published at ''BMC Neuroscience'' '''2010''' * Karen Safaryan, Reinoud Maex, Rod Adams, Neil Davey, and Volker Steuber The Effect of Non-Specific LTD on Pattern Recognition in Cerebellar Purkinje Cells, published at ''BMC Neuroscience'' '''11''', P92 d59357ee41959977210e4fd0224679e1d3ad56a3 604 593 2019-10-18T08:27:51Z Akukol 3 wikitext text/x-wiki Please add details of any papers written using the cluster here. This allows us to keep track of the productivity of the cluster. You can use any format you like so long as you identify the authors, the title (this helps us keep track of what different areas of research are being carried out with the cluster), and an indication of the bibliographic details and date. * Patel, H., & Kukol, A. (2019). Prediction of ligands to universally conserved binding sites of the influenza A virus nuclear export protein. Virology, 537, 97-103. * Taylor, P., Kobayashi, C., Federrath, C. The metallicity and elemental abundance maps of kinematically atypical galaxies for constraining minor merger and accretion histories. '''2019'''. MNRAS, 485, 3215 * Wang, E.X., Taylor, P., Federrath, C., Kobayashi, C. The impact of black hole seeding in cosmological simulations. '''2019'''. 
MNRAS, 483, 4640 * Taylor, P., Federrath, C., Kobayashi, C. The origin of kinematically distinct cores and misaligned gas discs in galaxies from cosmological simulations. '''2018'''. MNRAS, 479, 141 * Taylor, P., Kobayashi, C. The metallicity and elemental abundance gradients of simulated galaxies and their environmental dependence. '''2017'''. MNRAS, 471, 3856 * Taylor, P., Federrath, C., Kobayashi, C. Star formation in simulated galaxies: understanding the transition to quiescence at 3x10^10 solar masses. '''2017'''. MNRAS, 469, 4249 *Taylor, P., Kobayashi, C. Time evolution of galaxy scaling relations in cosmological simulations. '''2016'''. MNRAS, 463, 2465 *Taylor, P., Kobayashi, C. Quantifying AGN-driven metal-enhanced outflows in chemodynamical simulations. '''2015'''. MNRAS, 452L, 59 *Taylor, P., Kobayashi, C. The effects of AGN feedback on present-day galaxy properties in cosmological simulations. '''2015'''. MNRAS, 448, 1835 *Taylor, P., Kobayashi, C. Seeding black holes in cosmological simulations. '''2014'''. MNRAS, 442, 2751 * Stotz, H., Pascoe, H., Parham, H., Fitt, B., Mashanova, A., Kukol, A., ... & Hossein, B. ('''2018'''). Genomic evidence for genes encoding leucine-rich repeat receptors linked to resistance against the eukaryotic extra- and intracellular Brassica napus pathogens Leptosphaeria maculans and Plasmodiophora brassicae. ''PLoS ONE'', 13(06), 1-17. [e0198201]. DOI: 10.1371/journal.pone.0198201 * Patel, H., & Kukol, A. ('''2017'''). Evolutionary conservation of influenza A PB2 sequences reveals potential target sites for small molecule inhibitors. ''Virology'', 509, 112-120. DOI: 10.1016/j.virol.2017.06.009 * Patel, H., & Kukol, A. ('''2016'''). Evaluation of a novel virtual screening strategy using receptor decoy binding sites. ''J Negative Results in BioMedicine'', 15(15). * Kukol A, Hughes DJ ('''2014'''), Large-scale analysis of Influenza A virus nucleoprotein sequence conservation reveals potential drug-target sites, ''Virology'', 454/55: 40-47. * Poojari, C, Kukol, A, Strodel, B ('''2013'''). How the amyloid-β peptide and membranes affect each other: An extensive simulation study, ''Biochimica et Biophysica Acta - Biomembranes'', 1828(2), 327-339. * Steuernagel O., Kakofengitis D., Ritter G., Wigner flow reveals topological order in quantum phase space dynamics, '''2012''', accepted by ''PRL''. * Hardcastle MJ, Krause MGH, Numerical modelling of the lobes of radio galaxies in cluster environments, '''2012''', submitted to ''MNRAS'' * Kalia M, Kukol A ('''2011''') Structure and dynamics of the kinase IKK-β – a key regulator of the NF-kappa B transcription factor, ''J Struct Biol'', 176(2), 133-142. * Kukol A, Consensus virtual screening approaches to predict protein ligands ('''2011'''), ''Eur J Med Chem'', 46(9), 4661-4664. * Hardcastle MJ, Croston JH, Modelling TeV gamma-ray emission from the kiloparsec-scale jets of Centaurus A and M87, '''2011''', ''MNRAS'' 415 433 * Goodger JL, Croston JH, Hardcastle MJ, The influence of radio-loud AGN on groups of galaxies: Paper I - the X-ray luminosity-temperature relationship, submitted to ''MNRAS'', May 2011. 
* de Sousa G, Maex R, Adams R, Davey N, Steuber V, Optimization of neuronal morphologies for pattern recognition, published at ''BMC Neuroscience'' '''2010''' * Karen Safaryan, Reinoud Maex, Rod Adams, Neil Davey, and Volker Steuber The Effect of Non-Specific LTD on Pattern Recognition in Cerebellar Purkinje Cells, published at ''BMC Neuroscience'' '''11''', P92 1844071043499bc6b3f64745d85f6e7447907dae 605 604 2019-10-18T08:29:32Z Akukol 3 wikitext text/x-wiki Please add details of any papers written using the cluster here. This allows us to keep track of the productivity of the cluster. You can use any format you like so long as you identify the authors, the title (this helps us keep track of what different areas of research are being carried out with the cluster), and an indication of the bibliographic details and date. * Patel, H., & Kukol, A. ('''2019'''). Prediction of ligands to universally conserved binding sites of the influenza A virus nuclear export protein. Virology, 537, 97-103. * Taylor, P., Kobayashi, C., Federrath, C. The metallicity and elemental abundance maps of kinematically atypical galaxies for constraining minor merger and accretion histories. '''2019'''. MNRAS, 485, 3215 * Wang, E.X., Taylor, P., Federrath, C., Kobayashi, C. The impact of black hole seeding in cosmological simulations. '''2019'''. MNRAS, 483, 4640 * Taylor, P., Federrath, C., Kobayashi, C. The origin of kinematically distinct cores and misaligned gas discs in galaxies from cosmological simulations. '''2018'''. MNRAS, 479, 141 * Taylor, P., Kobayashi, C. The metallicity and elemental abundance gradients of simulated galaxies and their environmental dependence. '''2017'''. MNRAS, 471, 3856 * Taylor, P., Federrath, C., Kobayashi, C. Star formation in simulated galaxies: understanding the transition to quiescence at 3x10^10 solar masses. '''2017'''. MNRAS, 469, 4249 *Taylor, P., Kobayashi, C. Time evolution of galaxy scaling relations in cosmological simulations. '''2016'''. MNRAS, 463, 2465 *Taylor, P., Kobayashi, C. Quantifying AGN-driven metal-enhanced outflows in chemodynamical simulations. '''2015'''. MNRAS, 452L, 59 *Taylor, P., Kobayashi, C. The effects of AGN feedback on present-day galaxy properties in cosmological simulations. '''2015'''. MNRAS, 448, 1835 *Taylor, P., Kobayashi, C. Seeding black holes in cosmological simulations. '''2014'''. MNRAS, 442, 2751 * Stotz, H., Pascoe, H., Parham, H., Fitt, B., Mashanova, A., Kukol, A., ... & Hossein, B. ('''2018'''). Genomic evidence for genes encoding leucine-rich repeat receptors linked to resistance against the eukaryotic extra- and intracellular Brassica napus pathogens Leptosphaeria maculans and Plasmodiophora brassicae. ''PLoS ONE'', 13(06), 1-17. [e0198201]. DOI: 10.1371/journal.pone.0198201 * Patel, H., & Kukol, A. ('''2017'''). Evolutionary conservation of influenza A PB2 sequences reveals potential target sites for small molecule inhibitors. ''Virology'', 509, 112-120. DOI: 10.1016/j.virol.2017.06.009 * Patel, H., & Kukol, A. ('''2016'''). Evaluation of a novel virtual screening strategy using receptor decoy binding sites. ''J Negative Results in BioMedicine'', 15(15). * Kukol A, Hughes DJ ('''2014'''), Large-scale analysis of Influenza A virus nucleoprotein sequence conservation reveals potential drug-target sites, ''Virology'', 454/55: 40-47. * Poojari, C, Kukol, A, Strodel, B ('''2013'''). 
How the amyloid-β peptide and membranes affect each other: An extensive simulation study, ''Biochimica et Biophysica Acta - Biomembranes'', 1828(2), 327-339. * Steuernagel O., Kakofengitis D., Ritter G., Wigner flow reveals topological order in quantum phase space dynamics, '''2012''', accepted by ''PRL''. * Hardcastle MJ, Krause MGH, Numerical modelling of the lobes of radio galaxies in cluster environments, '''2012''', submitted to ''MNRAS'' * Kalia M, Kukol A ('''2011''') Structure and dynamics of the kinase IKK-β – a key regulator of the NF-kappa B transcription factor, ''J Struct Biol'', 176(2), 133-142. * Kukol A, Consensus virtual screening approaches to predict protein ligands ('''2011'''), ''Eur J Med Chem'', 46(9), 4661-4664. * Hardcastle MJ, Croston JH, Modelling TeV gamma-ray emission from the kiloparsec-scale jets of Centaurus A and M87, '''2011''', ''MNRAS'' 415 433 * Goodger JL, Croston JH, Hardcastle MJ, The influence of radio-loud AGN on groups of galaxies: Paper I - the X-ray luminosity-temperature relationship, submitted to ''MNRAS'', May 2011. * de Sousa G, Maex R, Adams R, Davey N, Steuber V, Optimization of neuronal morphologies for pattern recognition, published at ''BMC Neuroscience'' '''2010''' * Karen Safaryan, Reinoud Maex, Rod Adams, Neil Davey, and Volker Steuber The Effect of Non-Specific LTD on Pattern Recognition in Cerebellar Purkinje Cells, published at ''BMC Neuroscience'' '''11''', P92 13c24bb1d527a21b31c810b89b279fd9c0f20a0d 606 605 2019-10-18T08:31:21Z Akukol 3 wikitext text/x-wiki Please add details of any papers written using the cluster here. This allows us to keep track of the productivity of the cluster. You can use any format you like so long as you identify the authors, the title (this helps us keep track of what different areas of research are being carried out with the cluster), and an indication of the bibliographic details and date. * Patel, H., & Kukol, A. ('''2019'''). Prediction of ligands to universally conserved binding sites of the influenza A virus nuclear export protein. Virology, 537, 97-103. * Taylor, P., Kobayashi, C., Federrath, C. The metallicity and elemental abundance maps of kinematically atypical galaxies for constraining minor merger and accretion histories. '''2019'''. MNRAS, 485, 3215 * Wang, E.X., Taylor, P., Federrath, C., Kobayashi, C. The impact of black hole seeding in cosmological simulations. '''2019'''. MNRAS, 483, 4640 * Taylor, P., Federrath, C., Kobayashi, C. The origin of kinematically distinct cores and misaligned gas discs in galaxies from cosmological simulations. '''2018'''. MNRAS, 479, 141 * Taylor, P., Kobayashi, C. The metallicity and elemental abundance gradients of simulated galaxies and their environmental dependence. '''2017'''. MNRAS, 471, 3856 * Taylor, P., Federrath, C., Kobayashi, C. Star formation in simulated galaxies: understanding the transition to quiescence at 3x10^10 solar masses. '''2017'''. MNRAS, 469, 4249 *Taylor, P., Kobayashi, C. Time evolution of galaxy scaling relations in cosmological simulations. '''2016'''. MNRAS, 463, 2465 *Taylor, P., Kobayashi, C. Quantifying AGN-driven metal-enhanced outflows in chemodynamical simulations. '''2015'''. MNRAS, 452L, 59 *Taylor, P., Kobayashi, C. The effects of AGN feedback on present-day galaxy properties in cosmological simulations. '''2015'''. MNRAS, 448, 1835 *Taylor, P., Kobayashi, C. Seeding black holes in cosmological simulations. '''2014'''. MNRAS, 442, 2751 * Stotz, H., Pascoe, H., Parham, H., Fitt, B., Mashanova, A., Kukol, A., ... 
& Hossein, B. ('''2018'''). Genomic evidence for genes encoding leucine-rich repeat receptors linked to resistance against the eukaryotic extra- and intracellular Brassica napus pathogens Leptosphaeria maculans and Plasmodiophora brassicae. ''PLoS ONE'', 13(06), 1-17. [e0198201]. DOI: 10.1371/journal.pone.0198201 * Patel, H., & Kukol, A. ('''2017'''). Evolutionary conservation of influenza A PB2 sequences reveals potential target sites for small molecule inhibitors. ''Virology'', 509, 112-120. DOI: 10.1016/j.virol.2017.06.009 * Patel, H., & Kukol, A. ('''2016'''). Evaluation of a novel virtual screening strategy using receptor decoy binding sites. ''J Negative Results in BioMedicine'', 15(15). * Kukol A, Hughes DJ ('''2014'''), Large-scale analysis of Influenza A virus nucleoprotein sequence conservation reveals potential drug-target sites, ''Virology'', 454/55: 40-47. * Poojari, C, Kukol, A, Strodel, B ('''2013'''). How the amyloid-β peptide and membranes affect each other: An extensive simulation study, ''Biochimica et Biophysica Acta - Biomembranes'', 1828(2), 327-339. * Steuernagel O., Kakofengitis D., Ritter G., Wigner flow reveals topological order in quantum phase space dynamics, '''2012''', accepted by ''PRL''. * Hardcastle MJ, Krause MGH, Numerical modelling of the lobes of radio galaxies in cluster environments, '''2012''', submitted to ''MNRAS'' * Kalia M, Kukol A ('''2011''') Structure and dynamics of the kinase IKK-β – a key regulator of the NF-kappa B transcription factor, ''J Struct Biol'', 176(2), 133-142. * Kukol A ('''2011''') Consensus virtual screening approaches to predict protein ligands, ''Eur J Med Chem'', 46(9), 4661-4664. * Hardcastle MJ, Croston JH, Modelling TeV gamma-ray emission from the kiloparsec-scale jets of Centaurus A and M87, '''2011''', ''MNRAS'' 415 433 * Goodger JL, Croston JH, Hardcastle MJ, The influence of radio-loud AGN on groups of galaxies: Paper I - the X-ray luminosity-temperature relationship, submitted to ''MNRAS'', May 2011. * de Sousa G, Maex R, Adams R, Davey N, Steuber V, Optimization of neuronal morphologies for pattern recognition, published at ''BMC Neuroscience'' '''2010''' * Karen Safaryan, Reinoud Maex, Rod Adams, Neil Davey, and Volker Steuber The Effect of Non-Specific LTD on Pattern Recognition in Cerebellar Purkinje Cells, published at ''BMC Neuroscience'' '''11''', P92 7cf98feef2db6b80a67329e24bd976e670d47953 607 606 2019-10-18T08:32:07Z Akukol 3 wikitext text/x-wiki Please add details of any papers written using the cluster here. This allows us to keep track of the productivity of the cluster. You can use any format you like so long as you identify the authors, the title (this helps us keep track of what different areas of research are being carried out with the cluster), and an indication of the bibliographic details and date. * Patel, H., & Kukol, A. ('''2019'''). Prediction of ligands to universally conserved binding sites of the influenza A virus nuclear export protein. Virology, 537, 97-103. * Taylor, P., Kobayashi, C., Federrath, C. The metallicity and elemental abundance maps of kinematically atypical galaxies for constraining minor merger and accretion histories. '''2019'''. MNRAS, 485, 3215 * Wang, E.X., Taylor, P., Federrath, C., Kobayashi, C. The impact of black hole seeding in cosmological simulations. '''2019'''. MNRAS, 483, 4640 * Taylor, P., Federrath, C., Kobayashi, C. The origin of kinematically distinct cores and misaligned gas discs in galaxies from cosmological simulations. '''2018'''. 
MNRAS, 479, 141 * Taylor, P., Kobayashi, C. The metallicity and elemental abundance gradients of simulated galaxies and their environmental dependence. '''2017'''. MNRAS, 471, 3856 * Taylor, P., Federrath, C., Kobayashi, C. Star formation in simulated galaxies: understanding the transition to quiescence at 3x10^10 solar masses. '''2017'''. MNRAS, 469, 4249 *Taylor, P., Kobayashi, C. Time evolution of galaxy scaling relations in cosmological simulations. '''2016'''. MNRAS, 463, 2465 *Taylor, P., Kobayashi, C. Quantifying AGN-driven metal-enhanced outflows in chemodynamical simulations. '''2015'''. MNRAS, 452L, 59 *Taylor, P., Kobayashi, C. The effects of AGN feedback on present-day galaxy properties in cosmological simulations. '''2015'''. MNRAS, 448, 1835 *Taylor, P., Kobayashi, C. Seeding black holes in cosmological simulations. '''2014'''. MNRAS, 442, 2751 * Stotz, H., Pascoe, H., Parham, H., Fitt, B., Mashanova, A., Kukol, A., ... & Hossein, B. ('''2018'''). Genomic evidence for genes encoding leucine-rich repeat receptors linked to resistance against the eukaryotic extra- and intracellular Brassica napus pathogens Leptosphaeria maculans and Plasmodiophora brassicae. ''PLoS ONE'', 13(06), 1-17. [e0198201]. DOI: 10.1371/journal.pone.0198201 * Patel, H., & Kukol, A. ('''2017'''). Evolutionary conservation of influenza A PB2 sequences reveals potential target sites for small molecule inhibitors. ''Virology'', 509, 112-120. DOI: 10.1016/j.virol.2017.06.009 * Patel, H., & Kukol, A. ('''2016'''). Evaluation of a novel virtual screening strategy using receptor decoy binding sites. ''J Negative Results in BioMedicine'', 15(15). * Kukol A, Hughes DJ ('''2014'''). Large-scale analysis of Influenza A virus nucleoprotein sequence conservation reveals potential drug-target sites, ''Virology'', 454/55: 40-47. * Poojari, C, Kukol, A, Strodel, B ('''2013'''). How the amyloid-β peptide and membranes affect each other: An extensive simulation study, ''Biochimica et Biophysica Acta - Biomembranes'', 1828(2), 327-339. * Steuernagel O., Kakofengitis D., Ritter G., Wigner flow reveals topological order in quantum phase space dynamics, '''2012''', accepted by ''PRL''. * Hardcastle MJ, Krause MGH, Numerical modelling of the lobes of radio galaxies in cluster environments, '''2012''', submitted to ''MNRAS'' * Kalia M, Kukol A ('''2011'''.) Structure and dynamics of the kinase IKK-β – a key regulator of the NF-kappa B transcription factor, ''J Struct Biol'', 176(2), 133-142. * Kukol A ('''2011'''). Consensus virtual screening approaches to predict protein ligands, ''Eur J Med Chem'', 46(9), 4661-4664. * Hardcastle MJ, Croston JH, Modelling TeV gamma-ray emission from the kiloparsec-scale jets of Centaurus A and M87, '''2011''', ''MNRAS'' 415 433 * Goodger JL, Croston JH, Hardcastle MJ, The influence of radio-loud AGN on groups of galaxies: Paper I - the X-ray luminosity-temperature relationship, submitted to ''MNRAS'', May 2011. * de Sousa G, Maex R, Adams R, Davey N, Steuber V, Optimization of neuronal morphologies for pattern recognition, published at ''BMC Neuroscience'' '''2010''' * Karen Safaryan, Reinoud Maex, Rod Adams, Neil Davey, and Volker Steuber The Effect of Non-Specific LTD on Pattern Recognition in Cerebellar Purkinje Cells, published at ''BMC Neuroscience'' '''11''', P92 006df77d674a734de91ff5eb385ad5895328bb10 608 607 2019-10-18T08:33:36Z Akukol 3 wikitext text/x-wiki Please add details of any papers written using the cluster here. This allows us to keep track of the productivity of the cluster. 
You can use any format you like so long as you identify the authors, the title (this helps us keep track of what different areas of research are being carried out with the cluster), and an indication of the bibliographic details and date. * Patel, H., & Kukol, A. ('''2019'''). Prediction of ligands to universally conserved binding sites of the influenza A virus nuclear export protein. ''Virology'', 537, 97-103. * Taylor, P., Kobayashi, C., Federrath, C. The metallicity and elemental abundance maps of kinematically atypical galaxies for constraining minor merger and accretion histories. '''2019'''. MNRAS, 485, 3215 * Wang, E.X., Taylor, P., Federrath, C., Kobayashi, C. The impact of black hole seeding in cosmological simulations. '''2019'''. MNRAS, 483, 4640 * Taylor, P., Federrath, C., Kobayashi, C. The origin of kinematically distinct cores and misaligned gas discs in galaxies from cosmological simulations. '''2018'''. MNRAS, 479, 141 * Taylor, P., Kobayashi, C. The metallicity and elemental abundance gradients of simulated galaxies and their environmental dependence. '''2017'''. MNRAS, 471, 3856 * Taylor, P., Federrath, C., Kobayashi, C. Star formation in simulated galaxies: understanding the transition to quiescence at 3x10^10 solar masses. '''2017'''. MNRAS, 469, 4249 *Taylor, P., Kobayashi, C. Time evolution of galaxy scaling relations in cosmological simulations. '''2016'''. MNRAS, 463, 2465 *Taylor, P., Kobayashi, C. Quantifying AGN-driven metal-enhanced outflows in chemodynamical simulations. '''2015'''. MNRAS, 452L, 59 *Taylor, P., Kobayashi, C. The effects of AGN feedback on present-day galaxy properties in cosmological simulations. '''2015'''. MNRAS, 448, 1835 *Taylor, P., Kobayashi, C. Seeding black holes in cosmological simulations. '''2014'''. MNRAS, 442, 2751 * Stotz, H., Pascoe, H., Parham, H., Fitt, B., Mashanova, A., Kukol, A., ... & Hossein, B. ('''2018'''). Genomic evidence for genes encoding leucine-rich repeat receptors linked to resistance against the eukaryotic extra- and intracellular Brassica napus pathogens Leptosphaeria maculans and Plasmodiophora brassicae. ''PLoS ONE'', 13(06), 1-17. [e0198201]. DOI: 10.1371/journal.pone.0198201 * Patel, H., & Kukol, A. ('''2017'''). Evolutionary conservation of influenza A PB2 sequences reveals potential target sites for small molecule inhibitors. ''Virology'', 509, 112-120. DOI: 10.1016/j.virol.2017.06.009 * Patel, H., & Kukol, A. ('''2016'''). Evaluation of a novel virtual screening strategy using receptor decoy binding sites. ''J Negative Results in BioMedicine'', 15(15). * Kukol A, Hughes DJ ('''2014'''). Large-scale analysis of Influenza A virus nucleoprotein sequence conservation reveals potential drug-target sites, ''Virology'', 454/55: 40-47. * Poojari, C, Kukol, A, Strodel, B ('''2013'''). How the amyloid-β peptide and membranes affect each other: An extensive simulation study, ''Biochimica et Biophysica Acta - Biomembranes'', 1828(2), 327-339. * Steuernagel O., Kakofengitis D., Ritter G., Wigner flow reveals topological order in quantum phase space dynamics, '''2012''', accepted by ''PRL''. * Hardcastle MJ, Krause MGH, Numerical modelling of the lobes of radio galaxies in cluster environments, '''2012''', submitted to ''MNRAS'' * Kalia M, Kukol A ('''2011'''.) Structure and dynamics of the kinase IKK-β – a key regulator of the NF-kappa B transcription factor, ''J Struct Biol'', 176(2), 133-142. * Kukol A ('''2011'''). 
Consensus virtual screening approaches to predict protein ligands, ''Eur J Med Chem'', 46(9), 4661-4664. * Hardcastle MJ, Croston JH, Modelling TeV gamma-ray emission from the kiloparsec-scale jets of Centaurus A and M87, '''2011''', ''MNRAS'' 415 433 * Goodger JL, Croston JH, Hardcastle MJ, The influence of radio-loud AGN on groups of galaxies: Paper I - the X-ray luminosity-temperature relationship, submitted to ''MNRAS'', May 2011. * de Sousa G, Maex R, Adams R, Davey N, Steuber V, Optimization of neuronal morphologies for pattern recognition, published at ''BMC Neuroscience'' '''2010''' * Karen Safaryan, Reinoud Maex, Rod Adams, Neil Davey, and Volker Steuber The Effect of Non-Specific LTD on Pattern Recognition in Cerebellar Purkinje Cells, published at ''BMC Neuroscience'' '''11''', P92 e8927305a38de88ae9a8f149b1203c42db7789bd MPI 0 12 594 310 2019-05-14T19:50:46Z Mjh 2 wikitext text/x-wiki == What is MPI? == MPI stands for ''Message Passing Interface''. It provides a standard for communication between processes either running on the same host or over a network. Programs written using this standard can communicate with each other without needing to know details of the way in which they are connected. For a more detailed description see the [http://en.wikipedia.org/wiki/Message_Passing_Interface wikipedia page]. The MPI standard (with extensions in some cases) has been implemented by many different groups. MPI tutorials are widely available on the web. == MPI on the cluster == There are several implementations of MPI on the cluster. All of them provide compilation tools (mpicc, mpif77), libraries, and a means of running code. All use of MPI on the cluster must be integrated with the [[jobs|job control system]] and this page describes how to do that. You should read the page on [[jobs]] before trying to do anything with any of the example scripts given here! === MPICH2 === MPICH2 is the default implementation since it is provided as a package within Fedora. If you take no action, your MPI code will be compiled with MPICH2. Check which implementation you're using like this: <pre> > which mpicc /usr/lib64/mpich2/bin/mpicc </pre> MPICH2 is '''not''' Infiniband-aware. The communication between processes will use the Ethernet network, and so the latency and total available bandwidth of IPC will be about an order of magnitude worse than it would be with Infiniband-aware implementations (see below). To run MPICH2 jobs, your job control system script should call <tt>/usr/lib64/mpich2/bin/mpiexec</tt>: <pre> #!/bin/sh -f #PBS -N mpich2-demo #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ /usr/lib64/mpich2/bin/mpiexec /home/myusername/mympijob echo ------------------------------------------------------ echo Job ends </pre> <tt>mpiexec</tt> uses the Torque environment variables to start processes up on the appropriate nodes. 
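If you do not yet have an MPI binary to test with, a minimal sketch of the compile-and-submit step is shown here (the source file <tt>hello_mpi.c</tt> and the script name <tt>mpich2-demo.sh</tt> are placeholders, not files that exist on the cluster):
<pre>
# compile on a head node with the MPICH2 wrapper, which supplies the MPI
# include paths and libraries automatically
/usr/lib64/mpich2/bin/mpicc -O2 -o /home/myusername/mympijob hello_mpi.c

# then submit the job script shown above
qsub mpich2-demo.sh
</pre>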
Note that in this example the '''only''' key line is the one calling <tt>mpiexec</tt>. The others are Torque directives or are there for debugging/demonstration purposes. By default, <tt>mpiexec</tt> will run as many processes as there are processors available to it: so if you have nodes=16:ppn=8 set, as in the example above, there will be 16*8=128 MPI processes (8 per node). === MVAPICH2 === MVAPICH2 uses native Infiniband system calls to communicate. This means that the latency of IPC can be as low as a few microseconds and the full bandwidth of the Infiniband interfaces is available. This does not work on the smp machines (the hardware is incompatible); use e.g. MPICH2 there instead, or use the main queue. To use MVAPICH2, do <pre> module unload mpi/mpich-x86_64 module load mvapich2 </pre> Then you should see <pre> > which mpicc /usr/mpi/gcc/mvapich2-2.1a/bin/mpicc </pre> <tt>/usr/mpi/gcc/mvapich2-2.1a/bin/mpiexec</tt> works for starting MVAPICH2 jobs: <pre> #!/bin/sh -f #PBS -N mvapich2-demo #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ /usr/mpi/gcc/mvapich2-2.1a/bin/mpiexec /home/myusername/mympijob echo ------------------------------------------------------ echo Job ends </pre> === OpenMPI === <tt>module load mpi/openmpi-x86_64</tt> and then proceed as above. 60b09cf46f96e7dd812d0a15174a6bf84cacb64a 595 594 2019-05-14T19:52:06Z Mjh 2 wikitext text/x-wiki == What is MPI? == MPI stands for ''Message Passing Interface''. It provides a standard for communication between processes either running on the same host or over a network. Programs written using this standard can communicate with each other without needing to know details of the way in which they are connected. For a more detailed description see the [http://en.wikipedia.org/wiki/Message_Passing_Interface wikipedia page]. The MPI standard (with extensions in some cases) has been implemented by many different groups. MPI tutorials are widely available on the web. == MPI on the cluster == There are several implementations of MPI on the cluster. All of them provide compilation tools (mpicc, mpif77), libraries, and a means of running code. All use of MPI on the cluster must be integrated with the [[jobs|job control system]] and this page describes how to do that. You should read the page on [[jobs]] before trying to do anything with any of the example scripts given here! === MPICH2 === MPICH2 is the default implementation since it is provided as a package within Fedora. If you take no action, your MPI code will be compiled with MPICH2. Check which implementation you're using like this: <pre> > which mpicc /usr/lib64/mpich2/bin/mpicc </pre> MPICH2 is '''not''' Infiniband-aware.
The communication between processes will use the Ethernet network, and so the latency and total available bandwidth of IPC will be about an order of magnitude worse than it would be with Infiniband-aware implementations (see below). To run MPICH2 jobs, your job control system script should call <tt>/usr/lib64/mpich2/bin/mpiexec</tt>: <pre> #!/bin/sh -f #PBS -N mpich2-demo #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ /usr/lib64/mpich2/bin/mpiexec /home/myusername/mympijob echo ------------------------------------------------------ echo Job ends </pre> <tt>mpiexec</tt> uses the Torque environment variables to start processes up on the appropriate nodes. Note that in this example the '''only''' key line is the one calling <tt>mpiexec</tt>. The others are Torque directives or are there for debugging/demonstration purposes. By default, <tt>mpiexec</tt> will run as many processes as there are processors available to it: so if you have nodes=16:ppn=8 set, as in the example above, there will be 16*8=128 MPI processes (8 per node). === MVAPICH2 === MVAPICH2 uses native Infiniband system calls to communicate. This means that the latency of IPC can be as low as a few microseconds and the full bandwidth of the Infiniband interfaces is available. This does not work on the smp machines (the hardware is incompatible); use e.g. MPICH2 there instead, or use the main queue. To use MVAPICH2, do <pre> module unload mpi/mpich-x86_64 module load mvapich2 </pre> Then you should see <pre> > which mpicc /usr/mpi/gcc/mvapich2-2.1a/bin/mpicc </pre> <tt>/usr/mpi/gcc/mvapich2-2.1a/bin/mpiexec</tt> works for starting MVAPICH2 jobs: <pre> #!/bin/sh -f #PBS -N mvapich2-demo #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ /usr/mpi/gcc/mvapich2-2.1a/bin/mpiexec /home/myusername/mympijob echo ------------------------------------------------------ echo Job ends </pre> === OpenMPI === This should also be Infiniband-aware. <pre>module load mpi/openmpi-x86_64</pre> and then proceed as above. 62c480ef28be291fe9f05a474add53a61d4e5800 GPUs 0 71 596 513 2019-06-17T20:06:05Z Mjh 2 wikitext text/x-wiki Several machines on the cluster have attached NVIDIA GPUs. * gpu1 is the main cluster gpu machine.
It is currently the only machine in the <tt>gpu</tt> queue. The attached GPUs are 6 Tesla K80 units. * ramius has a single Tesla K40c. ramius is a private machine. The NVIDIA CUDA software is installed in '/soft/cuda-VERSION' where VERSION is the version number of CUDA. You can do e.g. <tt>module load cuda-10.0</tt> to make sure the libraries and compilers are on your path. Note: * At the moment, there is no provision for preventing contention between users of the Tesla unit. That is, if more than one user runs a job that tries to use it at one time, there may be contention and we don't know what effect this will have. * Equally, there is no provision for preventing contention on the host side -- if you only ask for a few CPU cores and another CPU-bound job takes the remainder, this may (or may not) interfere with your job. The easiest way to overcome any issues of contention would be to ask for all cores of the host machine, even if you're not going to use them all: then the job submission system will not allow any other jobs to run on the host, so you have exclusive access to the GPU. It may be sensible to bind your host-side process to cores physically in sockets that are connected to the PCI bus using the Linux process affinity setting commands. This will depend on your application --- left up to users at the moment. The current arrangement in principle allows non-GPU jobs running on a machine with a GPU to block CUDA jobs. If this becomes a problem, we will review the queueing arrangements. == Tensorflow == Tensorflow packages are available under python3.6. Make sure CUDA 10.0 is loaded (see above) and <tt>/soft/python3/usr/local/lib64/python3.6/site-packages</tt> is on your PYTHONPATH. == Via OpenGL context == It is possible for an application to access an OpenGL context on the GPUs using the provided 'headless' X server configuration. * User needs to start X server: <pre> X :42 & </pre> where "42" is a free display number, assuming "42" is free. If a number is chosen that is not free, an error message will appear. * Set the DISPLAY environment variable: <pre> export DISPLAY=:42.0 </pre> where "42" is the free display number as used for starting the X server, assuming the syntax of the BASH shell and .0 denotes the first screen (there are currently four - but using more than one is untested). * start application, which will need to request an OpenGL context, make use of it, output results unattended and quit after it is done (a render job for example). Submission of such jobs through the job queue is also untested, so needs to be discussed with the administrators first. 594247ee14f7c73df15d65136f010b222bc3db4d 610 596 2019-12-12T14:22:07Z Mjh 2 wikitext text/x-wiki Several machines on the cluster have attached NVIDIA GPUs. * gpu1: The attached GPUs are 6 Tesla K80 units with 16GB VRAM. * gpu2 and gpu3: These both have 2 Tesla V100 units with 16 GB VRAM. * ramius has a single Tesla K40c. ramius is a private machine, the other machines are accessible through the gpu queue. The NVIDIA CUDA software is installed in '/soft/cuda-VERSION' where VERSION is the version number of CUDA. You can do e.g. <tt>module load cuda-10.0</tt> to make sure the libraries and compilers are on your path. Note: * At the moment, there is no provision for preventing contention between users of the Tesla unit. That is, if more than one user runs a job that tries to use it at one time, there may be contention and we don't know what effect this will have. 
* Equally, there is no provision for preventing contention on the host side -- if you only ask for a few CPU cores and another CPU-bound job takes the remainder, this may (or may not) interfere with your job. The easiest way to overcome any issues of contention would be to ask for all cores of the host machine, even if you're not going to use them all: then the job submission system will not allow any other jobs to run on the host, so you have exclusive access to the GPU. It may be sensible to bind your host-side process to cores physically in sockets that are connected to the PCI bus using the Linux process affinity setting commands. This will depend on your application --- left up to users at the moment. The current arrangement in principle allows non-GPU jobs running on a machine with a GPU to block CUDA jobs. If this becomes a problem, we will review the queueing arrangements. == Tensorflow == Tensorflow packages are available under python3.6. Make sure CUDA 10.0 is loaded (see above) and <tt>/soft/python3/usr/local/lib64/python3.6/site-packages</tt> is on your PYTHONPATH. == Via OpenGL context == It is possible for an application to access an OpenGL context on the GPUs using the provided 'headless' X server configuration. * User needs to start X server: <pre> X :42 & </pre> where "42" is a free display number, assuming "42" is free. If a number is chosen that is not free, an error message will appear. * Set the DISPLAY environment variable: <pre> export DISPLAY=:42.0 </pre> where "42" is the free display number as used for starting the X server, assuming the syntax of the BASH shell and .0 denotes the first screen (there are currently four - but using more than one is untested). * start application, which will need to request an OpenGL context, make use of it, output results unattended and quit after it is done (a render job for example). Submission of such jobs through the job queue is also untested, so needs to be discussed with the administrators first. 95d545d678f3fbc13d63fc3418973d8ca5bffe53 Storage 0 8 597 565 2019-07-03T06:35:42Z Mjh 2 /* Distributed file system */ wikitext text/x-wiki The '''storage''' available on the cluster comes in three flavours. Please use all cluster storage space responsibly: when you no longer need it, delete the contents or transfer them elsewhere so that others can use the space. Large datasets will be being processed on the cluster. (See also [[policies]].) == Overview == Most cluster users will use /home for small amounts of important data (no more than 50 GB) and /beegfs/general for large amounts of data. Your home directory is created automatically for you, but you will need to make a directory /beegfs/general/your_username to work on the /beegfs volume. For details of all the different areas available, read more below. == System-wide NFS storage == This is shared across all machines and served by single servers in each case. It can be slow if many users are trying to access it simultaneously. We are phasing out the use of this kind of storage for large volumes. 
Currently general-user NFS volumes are: * 3 Tb of user home directories, mounted as /home * Software directory /soft * 58 Tb of scratch for CAIR users only, mounted as /cair-scratch * 334 Tb of scratch for CAR users only, mounted as /car-data * 77 Tb of scratch for CAIR users only, mounted as /cair-work In addition there are some special NFS volumes, which are mounted under /data and are only available to particular users or groups: * /data/lofar : for LOFAR-UK users * /data/jim : Jim Geach * /data/ralf : Ralf Napiwotzki * /data/astroml : Machine learning group The /home disc is for files that are important and require long-term storage (source code, qsub scripts, etc). It should not be used for general data or for large quantities of output from jobs, for example. Large files should be stored on the relevant data disk (if you want to work relative to /home, use a symbolic link). A [[quota]] system is in place on /home. == Distributed file system == The cluster uses the [https://www.beegfs.io/content/ Beegfs] file system for distributed storage, mounted at /beegfs . Files stored here are distributed over a number of servers in a way that's transparent to the user, who sees one big file system. Because there are multiple servers, /beegfs scales much better under load than the NFS storage. There is 830 TB of beegfs storage nominally distributed as follows: * 360 TB: general use, under /beegfs/general * 145 TB: CAR, under /beegfs/car * 231 TB: LOFAR-UK, under /beegfs/lofar * 90 TB: CAIR, under /beegfs/cair Use of the relevant subdirectories means that you think you have permission to use those allocations. In practice there is no difference in where or how the data are stored. == Disks local to machines == The four [[SMP machines]] have their own local scratch discs: see the relevant page for more information. These are mounted system-wide as /smp1 .. /smp4 Nodes also have a small amount of [[local disk space]] accessible to jobs (but not mounted on the head node), and [[ramdisks]] are also available. == Backups == No data area on the cluster is currently backed up. You must take responsibility for your own backups. /home is backed up by nightly rsync to another location on the cluster: that means if you delete a crucial file we ''may'' have a useful backup (ask straight away). 11f3ea4f95581d5c118a2692a3afb375a77e20c5 598 597 2019-07-19T10:30:34Z Mjh 2 wikitext text/x-wiki The '''storage''' available on the cluster comes in three flavours. Please use all cluster storage space responsibly: when you no longer need it, delete the contents or transfer them elsewhere so that others can use the space. Large datasets will be being processed on the cluster. (See also [[policies]].) == Overview == Most cluster users will use /home for small amounts of important data (no more than 50 GB) and /beegfs/general for large amounts of data. Your home directory is created automatically for you, but you will need to make a directory /beegfs/general/your_username to work on the /beegfs volume. For details of all the different areas available, read more below. == System-wide NFS storage == This is shared across all machines and served by single servers in each case. It can be slow if many users are trying to access it simultaneously. We are phasing out the use of this kind of storage for large volumes. 
Currently general-user NFS volumes are: * 3 Tb of user home directories, mounted as /home * Software directory /soft * 58 Tb of scratch for CAIR users only, mounted as /cair-scratch * 334 Tb of scratch for CAR users only, mounted as /car-data * 77 Tb of scratch for CAIR users only, mounted as /cair-work In addition there are some special NFS volumes, which are mounted under /data and are only available to particular users or groups: * /data/lofar : for LOFAR-UK users * /data/jim : Jim Geach * /data/ralf : Ralf Napiwotzki * /data/astroml : Machine learning group The /home disc is for files that are important and require long-term storage (source code, qsub scripts, etc). It should not be used for general data or for large quantities of output from jobs, for example. Large files should be stored on the relevant data disk (if you want to work relative to /home, use a symbolic link). A [[quota]] system is in place on /home. == Distributed file system == The cluster uses the [https://www.beegfs.io/content/ Beegfs] file system for distributed storage, mounted at /beegfs . Files stored here are distributed over a number of servers in a way that's transparent to the user, who sees one big file system. Because there are multiple servers, /beegfs scales much better under load than the NFS storage. There is 951 TB of beegfs storage nominally distributed as follows: * 485 TB: general use, under /beegfs/general * 145 TB: CAR, under /beegfs/car * 231 TB: LOFAR-UK, under /beegfs/lofar * 90 TB: CAIR, under /beegfs/cair Use of the relevant subdirectories means that you think you have permission to use those allocations. In practice there is no difference in where or how the data are stored. == Disks local to machines == The four [[SMP machines]] have their own local scratch discs: see the relevant page for more information. These are mounted system-wide as /smp1 .. /smp4 Nodes also have a small amount of [[local disk space]] accessible to jobs (but not mounted on the head node), and [[ramdisks]] are also available. == Backups == No data area on the cluster is currently backed up. You must take responsibility for your own backups. /home is backed up by nightly rsync to another location on the cluster: that means if you delete a crucial file we ''may'' have a useful backup (ask straight away). 9a5fa8cfd99bdf0c3a3ff338738ecb69cd600786 611 598 2020-01-20T12:25:05Z Mjh 2 /* Distributed file system */ wikitext text/x-wiki The '''storage''' available on the cluster comes in three flavours. Please use all cluster storage space responsibly: when you no longer need it, delete the contents or transfer them elsewhere so that others can use the space. Large datasets will be being processed on the cluster. (See also [[policies]].) == Overview == Most cluster users will use /home for small amounts of important data (no more than 50 GB) and /beegfs/general for large amounts of data. Your home directory is created automatically for you, but you will need to make a directory /beegfs/general/your_username to work on the /beegfs volume. For details of all the different areas available, read more below. == System-wide NFS storage == This is shared across all machines and served by single servers in each case. It can be slow if many users are trying to access it simultaneously. We are phasing out the use of this kind of storage for large volumes. 
Currently general-user NFS volumes are: * 3 Tb of user home directories, mounted as /home * Software directory /soft * 58 Tb of scratch for CAIR users only, mounted as /cair-scratch * 334 Tb of scratch for CAR users only, mounted as /car-data * 77 Tb of scratch for CAIR users only, mounted as /cair-work In addition there are some special NFS volumes, which are mounted under /data and are only available to particular users or groups: * /data/lofar : for LOFAR-UK users * /data/jim : Jim Geach * /data/ralf : Ralf Napiwotzki * /data/astroml : Machine learning group The /home disc is for files that are important and require long-term storage (source code, qsub scripts, etc). It should not be used for general data or for large quantities of output from jobs, for example. Large files should be stored on the relevant data disk (if you want to work relative to /home, use a symbolic link). A [[quota]] system is in place on /home. == Distributed file system == The cluster uses the [https://www.beegfs.io/content/ Beegfs] file system for distributed storage, mounted at /beegfs . Files stored here are distributed over a number of servers in a way that's transparent to the user, who sees one big file system. Because there are multiple servers, /beegfs scales much better under load than the NFS storage. There is 1.1 PB of beegfs storage nominally distributed as follows: * 485 TB: general use, under /beegfs/general * 272 TB: CAR, under /beegfs/car * 231 TB: LOFAR-UK, under /beegfs/lofar * 90 TB: CAIR, under /beegfs/cair Use of the relevant subdirectories means that you think you have permission to use those allocations. In practice there is no difference in where or how the data are stored. == Disks local to machines == The four [[SMP machines]] have their own local scratch discs: see the relevant page for more information. These are mounted system-wide as /smp1 .. /smp4 Nodes also have a small amount of [[local disk space]] accessible to jobs (but not mounted on the head node), and [[ramdisks]] are also available. == Backups == No data area on the cluster is currently backed up. You must take responsibility for your own backups. /home is backed up by nightly rsync to another location on the cluster: that means if you delete a crucial file we ''may'' have a useful backup (ask straight away). 2f31da734eae4053a433b38cbe4327ac83cb01b6 Administrators 0 6 599 550 2019-08-02T14:39:46Z Mjh 2 wikitext text/x-wiki == Administrators == UH users wanting help or support with UHHPC should ask for it through the UH helpdesk (e-mail helpdesk@herts.ac.uk or log in to [https://helpdesk.herts.ac.uk/ the web interface]). This includes a request for initial account creation. External users, such as external consortium users, should send e-mail to Martin Hardcastle (m.j.hardcastle@herts.ac.uk). 923e5226768800a85466a2d29cfd98b868291d3e 600 599 2019-08-02T14:40:40Z Mjh 2 /* Administrators */ wikitext text/x-wiki == Administrators == UH users wanting help or support with UHHPC should ask for it through the UH helpdesk (e-mail helpdesk@herts.ac.uk or log in to [https://helpdesk.herts.ac.uk/ the web interface]). This includes a request for initial account creation. Please make sure that your e-mail or help request includes 'UHHPC' in the subject line to make sure that it is routed through to the correct team. External users, such as external consortium users, should send e-mail to Martin Hardcastle (m.j.hardcastle@herts.ac.uk). 
df5c9679e378ac46b8a7cfa84dae10eb85387c5b 601 600 2019-08-05T17:25:13Z Mjh 2 wikitext text/x-wiki == Administrators == UH users wanting help or support with UHHPC should ask for it through the UH helpdesk (e-mail helpdesk@herts.ac.uk or log in to [https://helpdesk.herts.ac.uk/ the web interface]). This includes a request for initial account creation. Please make sure that your e-mail or help request includes 'UHHPC' in the subject line to make sure that it is routed through to the correct team. If you are asking for account creation, it will save time if you mention that you accept the [[terms of use]]. External users, such as external consortium users, should send e-mail to Martin Hardcastle (m.j.hardcastle@herts.ac.uk). 6da2c60dc649b9bddb9712ed6b9b831551e3f1f4 Shell 0 80 602 2019-09-20T06:01:36Z Mjh 2 Created page with "For historical reasons, the default shell on UHHPC is tcsh. The chsh command does not work on the cluster. If you wish to switch to bash, you must ask the [[administrators]]..." wikitext text/x-wiki For historical reasons, the default shell on UHHPC is tcsh. The chsh command does not work on the cluster. If you wish to switch to bash, you must ask the [[administrators]] to make the change. f87749be9b11c162c35af6ad2ba333df6800d9b8 LOFAR-UK Compute Facility 0 57 609 536 2019-10-22T15:34:10Z Mjh 2 wikitext text/x-wiki The LOFAR-UK compute facility is a part of the UH HPC facility reserved for LOFAR-UK use. It consists of a dedicated login machine and file server with 256 GB RAM and 16 cores, one dedicated compute node with 256 GB RAM and 32 cores, plus competitive access to the machines in the main cluster. If a big data analysis task is planned, a reservation can be made to ensure unrestricted access to computing power. LOFAR-UK users can get an account by contacting Martin Hardcastle. This will give them the ability to log in to <tt>lofar.herts.ac.uk</tt>. Data can be downloaded to the dedicated area <tt>/beegfs/lofar</tt>. Data processing should generally be carried out on the compute nodes rather than on lofar.herts.ac.uk. You can do your analysis by running interactive or non-interactive [[jobs]], as appropriate. See the [[LOFAR]] page for information on LOFAR software. A description of the [[generic pipeline]] is available. 241e6960cc4b92cb245eb865ec1034df5900c4a1 Software 0 17 614 561 2020-03-27T11:24:29Z Mjh 2 wikitext text/x-wiki This page documents the software installed on the cluster and its location. Detailed local documentation of software, if available, goes in a page specific to that software linked from this list. Users are encouraged to update the wiki with descriptions of the software they use. Contact the [[administrators]] if you need an upgraded version of any of these. 
= Programming languages and development environments = * GNU C, C++ and Fortran: available by default, <tt>module load gcc-6.4</tt> for newer versions * Intel C and Fortran: <tt>module load intel</tt> * Python 2 and 3 and many [[Python packages]] * [[Matlab]]: in <tt>/soft/MATLAB/R2018b/bin/matlab</tt> * [[IDL]]: in <tt>/soft/idl/idl/bin</tt> * [[R]]: installed by default or <tt>/soft/R</tt> * [[Julia]]: <tt>module load julia</tt> * [[CUDA]]: see [[GPU machines]] = Astronomical software = * Astropy: see [[Python packages]] * [[AIPS]]: 31DEC18 installed in <tt>/soft/aips</tt> * [[CASA]]: installed in <tt>/soft/casa...</tt> * [[Starlink]]: in <tt>/soft/star</tt> or <tt>/soft/stardev</tt> * [[LOFAR]]: in <tt>/soft/lofar-xxxxxx</tt>, see page for details * [[Miriad]]: in <tt> /soft/miriad</tt> * [[ciao]]: in <tt>/soft/ciao-x.x</tt> * [[PLUTO]]: see page for documentation * [[aoflagger]]: in <tt>/soft/aoflagger</tt> * [[wsclean]]: in <tt>/soft/wsclean</tt> * [[Brats]]: in <tt>/soft/brats</tt> * [[Topcat]] and Stilts: in <tt>/soft/topcat</tt> = Engineering = * [[Ansys Fluent]]: in <tt>/soft/ansys_inc</tt> * [[Star-CCM+]]: in <tt>/soft/STAR-CCM+14.02.010</tt> * [[Cantera]]: in <tt>/soft/cantera</tt> * [[Converge]]: in <tt>/soft/CONVERGE_Studio/</tt> = Molecular dynamics = * <u>[[Gromacs]]</u>: in <tt>/soft/gromacs-2018-gpu</tt> * Autodock [[Vina]]: in <tt>/soft/autodock_vina_1_1_1_linux_x86</tt> * [[Autodock]] : in <tt>/soft/autodock</tt> * [[iGemDock]]: in <tt>/soft/iGEMDOCKv2.1-centos/</tt> * [[Miriad]]: in <tt> /soft/miriad</tt> * [[ciao]]: in <tt>/soft/ciao-x.x</tt> = Computational neuroscience = * [[neuron]]: in <tt>/soft/nrn</tt> = Optimization = * [[Gurobi]]: in <tt>/soft/gurobi</tt> = Containerization = * [[Singularity]] in <tt>/soft/bin/singularity</tt> b2326246c25869672449c34bbd1ae4c1bed62264 615 614 2020-03-27T11:28:45Z Mjh 2 wikitext text/x-wiki This page documents the software installed on the cluster and its location. Detailed local documentation of software, if available, goes in a page specific to that software linked from this list. Users are encouraged to update the wiki with descriptions of the software they use. Contact the [[administrators]] if you need an upgraded version of any of these. 
= Programming languages and development environments = * GNU C, C++ and Fortran: available by default, <tt>module load gcc-6.4</tt> for newer versions * Intel C and Fortran: <tt>module load intel</tt> * Python 2 and 3 and many [[Python packages]] * [[Matlab]]: in <tt>/soft/MATLAB/R2018b/bin/matlab</tt> * [[IDL]]: in <tt>/soft/idl/idl/bin</tt> * [[R]]: installed by default or <tt>/soft/R</tt> * [[Julia]]: <tt>module load julia</tt> * [[GPUs|CUDA]]: <tt>module load cuda...</tt> = Astronomical software = * Astropy: see [[Python packages]] * [[AIPS]]: 31DEC18 installed in <tt>/soft/aips</tt> * [[CASA]]: installed in <tt>/soft/casa...</tt> * [[Starlink]]: in <tt>/soft/star</tt> or <tt>/soft/stardev</tt> * [[LOFAR]]: in <tt>/soft/lofar-xxxxxx</tt>, see page for details * [[Miriad]]: in <tt> /soft/miriad</tt> * [[ciao]]: in <tt>/soft/ciao-x.x</tt> * [[PLUTO]]: see page for documentation * [[aoflagger]]: in <tt>/soft/aoflagger</tt> * [[wsclean]]: in <tt>/soft/wsclean</tt> * [[Brats]]: in <tt>/soft/brats</tt> * [[Topcat]] and Stilts: in <tt>/soft/topcat</tt> = Engineering = * [[Ansys Fluent]]: in <tt>/soft/ansys_inc</tt> * [[Star-CCM+]]: in <tt>/soft/STAR-CCM+14.02.010</tt> * [[Cantera]]: in <tt>/soft/cantera</tt> * [[Converge]]: in <tt>/soft/CONVERGE_Studio/</tt> = Molecular dynamics = * <u>[[Gromacs]]</u>: in <tt>/soft/gromacs-2018-gpu</tt> * Autodock [[Vina]]: in <tt>/soft/autodock_vina_1_1_1_linux_x86</tt> * [[Autodock]] : in <tt>/soft/autodock</tt> * [[iGemDock]]: in <tt>/soft/iGEMDOCKv2.1-centos/</tt> * [[Miriad]]: in <tt> /soft/miriad</tt> * [[ciao]]: in <tt>/soft/ciao-x.x</tt> = Computational neuroscience = * [[neuron]]: in <tt>/soft/nrn</tt> = Optimization = * [[Gurobi]]: in <tt>/soft/gurobi</tt> = Containerization = * [[Singularity]] in <tt>/soft/bin/singularity</tt> 01a80beb2aea399e4ff37e38d991ebd2f1afcdeb 616 615 2020-03-27T11:30:13Z Mjh 2 wikitext text/x-wiki This page documents the software installed on the cluster and its location. Detailed local documentation of software, if available, goes in a page specific to that software linked from this list. Users are encouraged to update the wiki with descriptions of the software they use. Contact the [[administrators]] if you need an upgraded version of any of these. If you need any freely available software package that runs on Linux, in general we can install it. We do not have resources to pay for licences for commercial software, though, so if you need a commercial software package to be installed you will also need to provide funding for a licence. 
= Programming languages and development environments = * GNU C, C++ and Fortran: available by default, <tt>module load gcc-6.4</tt> for newer versions * Intel C and Fortran: <tt>module load intel</tt> * Python 2 and 3 and many [[Python packages]] * [[Matlab]]: in <tt>/soft/MATLAB/R2018b/bin/matlab</tt> * [[IDL]]: in <tt>/soft/idl/idl/bin</tt> * [[R]]: installed by default or <tt>/soft/R</tt> * [[Julia]]: <tt>module load julia</tt> * [[GPUs|CUDA]]: <tt>module load cuda...</tt> = Astronomical software = * Astropy: see [[Python packages]] * [[AIPS]]: 31DEC18 installed in <tt>/soft/aips</tt> * [[CASA]]: installed in <tt>/soft/casa...</tt> * [[Starlink]]: in <tt>/soft/star</tt> or <tt>/soft/stardev</tt> * [[LOFAR]]: in <tt>/soft/lofar-xxxxxx</tt>, see page for details * [[Miriad]]: in <tt> /soft/miriad</tt> * [[ciao]]: in <tt>/soft/ciao-x.x</tt> * [[PLUTO]]: see page for documentation * [[aoflagger]]: in <tt>/soft/aoflagger</tt> * [[wsclean]]: in <tt>/soft/wsclean</tt> * [[Brats]]: in <tt>/soft/brats</tt> * [[Topcat]] and Stilts: in <tt>/soft/topcat</tt> = Engineering = * [[Ansys Fluent]]: in <tt>/soft/ansys_inc</tt> * [[Star-CCM+]]: in <tt>/soft/STAR-CCM+14.02.010</tt> * [[Cantera]]: in <tt>/soft/cantera</tt> * [[Converge]]: in <tt>/soft/CONVERGE_Studio/</tt> = Molecular dynamics = * <u>[[Gromacs]]</u>: in <tt>/soft/gromacs-2018-gpu</tt> * Autodock [[Vina]]: in <tt>/soft/autodock_vina_1_1_1_linux_x86</tt> * [[Autodock]] : in <tt>/soft/autodock</tt> * [[iGemDock]]: in <tt>/soft/iGEMDOCKv2.1-centos/</tt> * [[Miriad]]: in <tt> /soft/miriad</tt> * [[ciao]]: in <tt>/soft/ciao-x.x</tt> = Computational neuroscience = * [[neuron]]: in <tt>/soft/nrn</tt> = Optimization = * [[Gurobi]]: in <tt>/soft/gurobi</tt> = Containerization = * [[Singularity]] in <tt>/soft/bin/singularity</tt> 21beed65a52fc17ed27140c4f0a3fe409774ff48 619 616 2020-03-29T14:05:29Z H.patel 14 /* Molecular dynamics */ wikitext text/x-wiki This page documents the software installed on the cluster and its location. Detailed local documentation of software, if available, goes in a page specific to that software linked from this list. Users are encouraged to update the wiki with descriptions of the software they use. Contact the [[administrators]] if you need an upgraded version of any of these. If you need any freely available software package that runs on Linux, in general we can install it. We do not have resources to pay for licences for commercial software, though, so if you need a commercial software package to be installed you will also need to provide funding for a licence. 
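Many of the compilers and libraries listed below are made available through environment modules. As a quick sketch (module names other than those given on this page may differ on your machine), you can check what is available and load what you need like this:
<pre>
# list the modules available on the machine you are logged in to
module avail

# load a newer compiler and check that it is now first on your PATH
module load gcc-6.4
which gcc

# see what is currently loaded, and unload anything you no longer want
module list
module unload gcc-6.4
</pre>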
= Programming languages and development environments = * GNU C, C++ and Fortran: available by default, <tt>module load gcc-6.4</tt> for newer versions * Intel C and Fortran: <tt>module load intel</tt> * Python 2 and 3 and many [[Python packages]] * [[Matlab]]: in <tt>/soft/MATLAB/R2018b/bin/matlab</tt> * [[IDL]]: in <tt>/soft/idl/idl/bin</tt> * [[R]]: installed by default or <tt>/soft/R</tt> * [[Julia]]: <tt>module load julia</tt> * [[GPUs|CUDA]]: <tt>module load cuda...</tt> = Astronomical software = * Astropy: see [[Python packages]] * [[AIPS]]: 31DEC18 installed in <tt>/soft/aips</tt> * [[CASA]]: installed in <tt>/soft/casa...</tt> * [[Starlink]]: in <tt>/soft/star</tt> or <tt>/soft/stardev</tt> * [[LOFAR]]: in <tt>/soft/lofar-xxxxxx</tt>, see page for details * [[Miriad]]: in <tt> /soft/miriad</tt> * [[ciao]]: in <tt>/soft/ciao-x.x</tt> * [[PLUTO]]: see page for documentation * [[aoflagger]]: in <tt>/soft/aoflagger</tt> * [[wsclean]]: in <tt>/soft/wsclean</tt> * [[Brats]]: in <tt>/soft/brats</tt> * [[Topcat]] and Stilts: in <tt>/soft/topcat</tt> = Engineering = * [[Ansys Fluent]]: in <tt>/soft/ansys_inc</tt> * [[Star-CCM+]]: in <tt>/soft/STAR-CCM+14.02.010</tt> * [[Cantera]]: in <tt>/soft/cantera</tt> * [[Converge]]: in <tt>/soft/CONVERGE_Studio/</tt> = Molecular dynamics = * [[NAMD]]: in <tt>/soft/NAMD_2.13_Linux-x86_64-ibverbs-smp-CUDA</tt> * <u>[[Gromacs]]</u>: in <tt>/soft/gromacs-2018-gpu</tt> * Autodock [[Vina]]: in <tt>/soft/autodock_vina_1_1_1_linux_x86</tt> * [[Autodock]] : in <tt>/soft/autodock</tt> * [[iGemDock]]: in <tt>/soft/iGEMDOCKv2.1-centos/</tt> * [[Miriad]]: in <tt> /soft/miriad</tt> * [[ciao]]: in <tt>/soft/ciao-x.x</tt> = Computational neuroscience = * [[neuron]]: in <tt>/soft/nrn</tt> = Optimization = * [[Gurobi]]: in <tt>/soft/gurobi</tt> = Containerization = * [[Singularity]] in <tt>/soft/bin/singularity</tt> c398b8ba7ac9acc9a5dc3e7352c4ee19b189cf59 User:Mayaahorton 2 81 617 2020-03-27T13:45:33Z Mjh 2 Creating user page for new user. wikitext text/x-wiki Postgrad in computational astrophysics, working part-time on UHHPC support. 5c02a2b9df9c753269c3c9586e3a274bfc033078 User talk:Mayaahorton 3 82 618 2020-03-27T13:45:33Z Mjh 2 Welcome! wikitext text/x-wiki '''Welcome to ''Clusterwiki''!''' We hope you will contribute much and well. You will probably want to read the [https://www.mediawiki.org/wiki/Special:MyLanguage/Help:Contents help pages]. Again, welcome and have fun! [[User:Mjh|Mjh]] ([[User talk:Mjh|talk]]) 13:45, 27 March 2020 (UTC) 5011cc87da6369a07d18c77e7bda9efcd46acd3c NAMD 0 83 620 2020-03-29T14:16:22Z H.patel 14 Created page with "NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. Enhanced sampling techniques such as accelerated MD simulati..." wikitext text/x-wiki NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. Enhanced sampling techniques such as accelerated MD simulations can be performed using NAMD. To run NAMD, a shell script needs to be prepared as shown below and edited according to your needs. Submit the job to the cluster with the qsub command, e.g. qsub runjobNAMD.sh <i>Hershna Patel</i> 752ea7dc63192eca2cc70bf8946c1b1704fd6010 621 620 2020-03-29T14:24:49Z H.patel 14 wikitext text/x-wiki NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. Enhanced sampling techniques such as accelerated MD simulations can be performed using NAMD.
To run NAMD, a shell script needs to be prepared as shown below and edited according to your needs. Submit the job to the cluster with the qsub command, e.g. qsub runjobNAMD.sh <i>Hershna Patel</i> ________________________________________________________________________________________ <pre>#!/bin/sh #PBS -N NamdTest #PBS -q gpu1 #PBS -l nodes=1:ppn=16 #PBS -l walltime=48:00:00 #PBS -j oe #PBS -u hpatel # -N runs a job with name 'NamdTest' on the gpu machine on the cluster # -q job starts up on gpu1 # -l set a maximum time of forty eight hours (wall-clock time) # -j merge 'output' and 'standard error' and output both to 'standard output' (-j oe) # -u specifies user 'hpatel' # set required path: source /soft/NAMD_2.13_Linux-x86_64-ibverbs-smp-CUDA # specify working directory: cd /home/hpatel/... ### This is the command ### ./namd2 +idlepoll +p16 configfile.namd > output.log ### command end ### # start with 'qsub runjobNAMD.sh' </pre> 6104771a050868ef7245a99909e5ee0c9a862660 622 621 2020-03-29T14:25:36Z H.patel 14 wikitext text/x-wiki NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. Enhanced sampling techniques such as accelerated MD simulations can be performed using NAMD. To run NAMD, a shell script needs to be prepared as shown below and edited according to your needs. Submit the job to the cluster with the qsub command, e.g. qsub runjobNAMD.sh <i>Hershna Patel</i> <pre>#!/bin/sh #PBS -N NamdTest #PBS -q gpu1 #PBS -l nodes=1:ppn=16 #PBS -l walltime=48:00:00 #PBS -j oe #PBS -u hpatel # -N runs a job with name 'NamdTest' on the gpu machine on the cluster # -q job starts up on gpu1 # -l set a maximum time of forty eight hours (wall-clock time) # -j merge 'output' and 'standard error' and output both to 'standard output' (-j oe) # -u specifies user 'hpatel' # set required path: source /soft/NAMD_2.13_Linux-x86_64-ibverbs-smp-CUDA # specify working directory: cd /home/hpatel/... ### This is the command ### ./namd2 +idlepoll +p16 configfile.namd > output.log ### command end ### # start with 'qsub runjobNAMD.sh' </pre> fdbe36965a58ab312e299c7c768b48ef029fe417 623 622 2020-03-29T14:27:59Z H.patel 14 wikitext text/x-wiki NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. Enhanced sampling techniques such as accelerated MD simulations can be performed using NAMD. To run NAMD v2.13, a shell script needs to be prepared as shown below and edited according to your needs. Submit the job to the cluster with the qsub command, e.g. qsub runjobNAMD.sh <i>Hershna Patel</i> <pre>#!/bin/sh #PBS -N NamdTest #PBS -q gpu1 #PBS -l nodes=1:ppn=16 #PBS -l walltime=48:00:00 #PBS -j oe #PBS -u hpatel # -N runs a job with name 'NamdTest' on the gpu machine on the cluster # -q job starts up on gpu1 # -l set a maximum time of forty eight hours (wall-clock time) # -j merge 'output' and 'standard error' and output both to 'standard output' (-j oe) # -u specifies user 'hpatel' # set required path: source /soft/NAMD_2.13_Linux-x86_64-ibverbs-smp-CUDA # specify working directory: cd /home/hpatel/... ### This is the command ### ./namd2 +idlepoll +p16 configfile.namd > output.log ### command end ### # start with 'qsub runjobNAMD.sh' </pre> 579aa47aee79dabf4e29ce5785982c211e5342a2 R 0 84 624 2020-04-01T13:02:53Z Mjh 2 Created page with "R is installed on head node and compute node machines by default. To use system-wide installations of packages set <tt>R_LIBS</tt> to <tt>/soft/R</tt>."
wikitext text/x-wiki R is installed on head node and compute node machines by default. To use system-wide installations of packages set <tt>R_LIBS</tt> to <tt>/soft/R</tt>. 7b2b3fad4f8142b6445ecbabdcfa9f6aa1bdd04b Jobs 0 9 625 543 2020-04-10T10:12:31Z Mjh 2 wikitext text/x-wiki The main mechanism for carrying out computing on the cluster is the '''batch queue system'''. This is based around [http://docs.adaptivecomputing.com/torque/help.htm Torque] (formerly known as PBS) and [http://docs.adaptivecomputing.com/maui/index.php Maui], which are free software and widely used in the HPC world, particularly in academia. Torque is the queueing system, which handles job submission, and Maui is the scheduler, which decides what jobs are run. Rather than running processes directly on the compute nodes, you submit a script which, in general, contains both directives to the queuing system and the commands needed to run your jobs. The scheduler will then decide, based on the available resources, whether your job can be run immediately or must be deferred till later. If other people's jobs are running, you are likely to have to wait. It is important therefore to request only the minimum resources you need: a job that only needs 8 nodes may be able to run immediately while a job that requests all nodes in the main cluster may need to wait for days. Once your job starts running the standard output and standard error can be saved or discarded, at your option. Most jobs will operate on files in one of the cluster-wide file systems (see [[storage]]). You can also arrange to have the system send you e-mail when your job starts and/or ends. You should enable [[passwordless ssh]] between nodes before starting to use the job submission system. Various parts of it and code that depends on it rely on being able to use ssh or scp. == Basic commands == The command to submit a job to the batch queue system is <tt>qsub</tt>. <tt>qsub</tt> takes one obligatory argument, the name of the script that will be run to start the job: <pre> qsub myjob.sh </pre> This must be a script: specifying a binary file will not work. Optionally, qsub can take a number of arguments that specify what resources the job requires and how it is to be handled by the queuing system. Alternatively, these can be specified in the script, tagged with lines beginning <tt>#PBS</tt> before any executable lines. For example, <pre> cat << END > myjob1.sh #!/bin/sh echo "Hello world" END qsub -N hello -m abe myjob1.sh </pre> and <pre> cat <<END > myjob2.sh #!/bin/sh #PBS -N hello #PBS -m abe echo "Hello world" END qsub myjob2.sh </pre> are functionally equivalent. It is often more useful to have the options specified in the script. You can also have a mixture: if the same option is specified on the <tt>qsub</tt> command line and in a script directive, the command line takes precedence. A full list of qsub options is given by <tt>man qsub</tt> or is available [http://docs.adaptivecomputing.com/torque/2-5-12/help.htm#topics/commands/qsub.htm online]. Some important ones are as follows: * -N: specify the name of the job as it will appear in the queue * -m: say when the system should e-mail you: <tt>-m abe</tt> means to e-mail on abort, begin and end. Before setting this flag, read the page on [[Mail]] and take appropriate action! * -k: specify whether the standard output or standard error streams of the job should be mirrored in your home directory. 
If you specify this, you will be able to see the output as it's generated, but it will appear in your home directory. If you don't, the output will be stored in the directory specified by -o and -e or in the current working directory of qsub if they are not specified, but it will only appear once the job is finished. * -j: specify whether the output and error streams should be kept separate or merged. * -o, -e: specify where standard output/error should be stored, if not in the current directory * -q: specify what [[queues|queue]] the job will be run on. * -t: start multiple jobs simultaneously (see below) * -W: for job inter-dependencies (see below) * -v: to set environment variables (see below) The most important option is <tt>-l</tt>, which specifies the resources to be used by the job. On our system the useful resources to specify are * <tt>walltime</tt>: the expected execution time in h:m:s format (if unspecified, the default walltime on all queues is 24 hours). * <tt>nodes</tt>: a list of requirements on the nodes to be allocated, or possibly more than one joined by a plus sign. * <tt>pmem</tt>: the physical [[memory]] requirements for your job. * <tt>file</tt>: the [[local disk space]] to be used by your job. For example, <pre> qsub -l walltime=2:0:0 -l nodes=64 job.sh ## run for two hours on 64 CPUs (which may be spread over up to 64 physical nodes) qsub -l walltime=1:0:0 -l nodes=2:ppn=32 job.sh ## run for one hour on 2 nodes with 32 CPUs per node qsub -l walltime=0:1:0 -l nodes=1:2 -l pmem=8gb job5.sh ## Run two processes on one node for one minute, requesting 8 Gb of memory each </pre> (In general you will not want to specify node or [[architecture|chassis]] names directly, as this will interfere with the efficiency of the scheduling. Do not attempt to use these options unless you know what you are doing. Note also that in the present configuration requests for fewer processors per node than are available may be consolidated onto fewer nodes. This should never be a problem unless you are specifically testing inter-node communication, in which case you could specify node names to force running on different nodes. For efficiency, you are recommended always to ask for and use the maximum number of processes per node for MPI or multi-threaded code -- that is, 32 processes for the main cluster. If you are running single-threaded code, you should ask for one processor per node. The defaults for all the other options are sensible but you ''must'' think carefully about the resources you need. Requesting more time or resources than you need will lead to inefficiency in scheduling your jobs: requesting less than you need (particularly in terms of walltime) is likely to mean that your job does not run to completion. A job that exceeds the walltime estimate will be terminated. 
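To help calibrate future requests, you can ask Torque what a running or recently completed job has actually consumed; the <tt>resources_used</tt> fields appear once the job has started. A minimal sketch (the job ID is a placeholder):
<pre>
# full status for one job; resources_used.cput, resources_used.mem and
# resources_used.walltime show what the job has actually used so far
qstat -f 1765 | grep resources_used
</pre>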
''A walltime should be specified: the default walltime (see [[Queues]]) is probably not what you want.'' Once a job is submitted, you can view its progress with <tt>qstat</tt>: <pre> >qstat Job id Name User Time Use S Queue ------------------- ---------------- --------------- -------- - ----- 1765.stri-cluster HCG62-doublebeta mjh 00:00:00 R all 1766.stri-cluster ...59-doublebeta mjh 00:00:00 R all 1767.stri-cluster ...07-doublebeta mjh 00:00:00 R all 1768.stri-cluster ...83-doublebeta mjh 00:00:00 R all 1769.stri-cluster ...73-doublebeta mjh 00:00:00 R all 1770.stri-cluster ...61-doublebeta mjh 0 Q all 1771.stri-cluster ...36-doublebeta mjh 0 Q all 1772.stri-cluster ...44-doublebeta mjh 0 Q all 1773.stri-cluster ...29-doublebeta mjh 0 Q all 1774.stri-cluster ...33-doublebeta mjh 0 Q all 1775.stri-cluster ...46-doublebeta mjh 0 Q all >qstat -a stri-cluster.herts.ac.uk: Req'd Req'd Elap Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time -------------------- -------- -------- ---------- ------ ----- --- ------ ----- - ----- 1765.stri-cluster.he mjh all HCG62-doub 11125 16 -- -- 05:00 R 01:56 1766.stri-cluster.he mjh all IC1459-dou 10658 16 -- -- 05:00 R 01:56 1767.stri-cluster.he mjh all NGC3607-do 7355 16 -- -- 05:00 R 01:56 1768.stri-cluster.he mjh all NGC383-dou 12069 16 -- -- 05:00 R 01:56 1769.stri-cluster.he mjh all NGC4073-do 12793 16 -- -- 05:00 R 01:56 1770.stri-cluster.he mjh all NGC4261-do -- 16 -- -- 05:00 Q -- 1771.stri-cluster.he mjh all NGC4636-do -- 16 -- -- 05:00 Q -- 1772.stri-cluster.he mjh all NGC5044-do -- 16 -- -- 05:00 Q -- 1773.stri-cluster.he mjh all NGC5129-do -- 16 -- -- 05:00 Q -- 1774.stri-cluster.he mjh all NGC533-dou -- 16 -- -- 05:00 Q -- 1775.stri-cluster.he mjh all NGC5846-do -- 16 -- -- 05:00 Q -- </pre> The different options to qstat give different views of the queue. Here we see that some jobs are running (R) and some are queued (Q). You may also see jobs that have recently completed (C), jobs that are held pending some condition (H) and, hopefully rarely, jobs where there is some error in the scheduling system (E). You can delete jobs from the queue with the Torque command <tt>qdel</tt>, you can change your resource request with <tt>qalter</tt>, and you can move queued jobs between different queues with <tt>qmove</tt>. Look at the <tt>man</tt> pages for these commands for more information. == Running code == Your job will start running on ''one'' of the nodes it has been allocated. It is your script's responsibility to start your code running on any other nodes you have been allocated. At run time, environment variables contain information about the job. A full list is [http://docs.adaptivecomputing.com/torque/2-5-12/help.htm#topics/2-jobs/exportedBatchEnvVar.htm here]. The crucial one is $PBS_NODEFILE, which provides a list of nodes on which the job should run. 
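For example, a short sketch of how a job script might inspect this file before doing any real work (Bourne-shell syntax, as in the example scripts on this page):
<pre>
# count the total number of slots and the number of distinct nodes
# that Torque has allocated to this job
NPROCS=`wc -l < $PBS_NODEFILE`
NNODES=`sort -u $PBS_NODEFILE | wc -l`
echo "Allocated $NPROCS slots on $NNODES distinct nodes"
</pre>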
''You must honour this node list.'' The [http://docs.adaptivecomputing.com/torque/2-5-12/help.htm#topics/commands/pbsdsh.htm pbsdsh] command can be used in your script to run a single program on all the allocated nodes, as in the following example: <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ pbsdsh hostname echo ------------------------------------------------------ echo Job ends </pre> pbsdsh would be suitable for starting a number of parallel processes which do not need to communicate, e.g. multiple identical Monte Carlo simulations. For scripts to start [[MPI]] processes in such a way that inter-process communication (IPC) works see the description of [[MPI]]. If you are running [[OpenMP]] code, which should only be running on a single node, then you need to make sure that the code does not grab all available CPUs. An easy way to make sure that you take only the CPUs that you have been allocated would be to put <pre> setenv OMP_NUM_THREADS `cat $PBS_NODEFILE | wc -l` </pre> in the qsub script before the code runs. == Environment == In all cases your script must set the environment that you want your program to run in. Environment variables, aliases and working directory are not inherited from the shell that runs qsub. So you will often want to change to an appropriate working directory and to source some startup files in your qsub script before you actually execute any code. <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ cd /home/fred/my_working_directory export PATH=/home/fred/my_binaries:${PATH} /usr/local/bin/mpiexec my-mpi-code arg1 arg2 echo ------------------------------------------------------ echo Job ends </pre> Note that you can set environment variables in the environment of the running script with the <tt>-v</tt> option to <tt>qsub</tt>: <pre> qsub -v NAME=fred myjob.sh </pre> The value of <tt>$NAME</tt> is then available to the script. This is useful if you want to write a generic script and control it using a parameter, rather than editing the script. == Multiple job submission == The <tt>-t</tt> option to qsub allows you to create multiple jobs simultaneously. 
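As a minimal sketch (the program, directory and file names are placeholders), each copy of an array job can use the <tt>$PBS_ARRAYID</tt> variable described in the next paragraph to select its own input and output files:
<pre>
#!/bin/sh
#PBS -N array-demo
#PBS -l nodes=1:ppn=1
#PBS -l walltime=1:00:00
# each copy of this script handles one input file, chosen by its array index
cd /home/myusername/mydata
./myprogram input.$PBS_ARRAYID > output.$PBS_ARRAYID
</pre>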
These will all run the same script but will differ in the value of the environment variable <tt>$PBS_ARRAYID</tt>. Your script should normally use this variable to decide how to behave differently. For example, multiple runs of a Monte Carlo simulation might use it to store the output in different files. The jobs will all be submitted to the queue immediately and may run concurrently, depending on the resources they request, so it is important to make sure that it is safe to do this. A common mistake is to use the -t option wrongly. <pre> qsub -t 4 myjob.qsub </pre> will only start ''one'' job, with $PBS_ARRAYID set to 4. <pre> qsub -t 1-4 myjob.qsub </pre> will queue four jobs, with $PBS_ARRAYID ranging from 1 to 4. For a large array you have the option to limit the number of jobs that will run concurrently -- perhaps because they all want access to IO resources and will compete with each other and run out of walltime if they all run at once. So <pre> qsub -t 1-1000%20 myjob.qsub </pre> will run 1000 versions of the job but ensure that only 20 are running at a given time. == Interactive jobs == If you need to access nodes interactively, see the separate page on [[interactive jobs]]. == Jobs that depend on other jobs == You may wish to submit jobs that depend on each other: for example, to submit a job that only runs after another job has finished. (This allows you to stack up very long computing requests without violating the maximum walltime constraints.) Look at <tt>man qsub</tt> for the full details, but for a simple dependence use <pre> qsub -W depend=afterany:123456 myjob.qsub </pre> which will cause your job to be run only after job 123456 has finished (no matter what its output status). 5282c269bedac59ea7b0a6888a06856682992eb2 Architecture 0 7 626 584 2020-04-10T10:13:16Z Mjh 2 wikitext text/x-wiki The cluster consists of == head/login nodes == * 2 head nodes: headnode1 and headnode2, each with 192 GB RAM and 2 x 12-core Xeon processors, for user login and development * job/queue server, uhhpc, with 12 cores and 32 GB RAM: web, jobs, mail, DNS etc == compute nodes == Older compute nodes are in 16-blade chassis: currently chassis6-9 are populated. The newer compute nodes, which will be used by most people, are in three racks, rack1-3. The smp machines are outside any particular rack or chassis. 
* 80 Xeons (Gold 6130) 2 socket x 16 core (no hyperthreading) with 192 GB RAM and FDR Infiniband form part of the Main cluster (node001-080: rack1, rack2, rack3), in the main queue * 12 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form the CAIR cluster (node081-92: chassis 6), in the cair_l queue * 2 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 24 GB RAM and DDR Infiniband form part of the main cluster (node093-94: chassis 6), in the main queue * 16 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form part of the CAR cluster (node097-node112: chassis 7), in the car queue * 16 Xeons (X5675s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form the rest of the CAR cluster (node113-128: chassis 8), in the car queue * 12 Xeons (E5-2660s) 2 socket x 8-core with 64 GB RAM and FDR Infiniband form the rest of the main cluster (node129-140: chassis 9), in the main queue * 4 Xeons (E5-2660s) 2 socket x 8-core with 16 GB RAM and FDR Infiniband are dedicated AQF (CAIR) nodes (node141-144: chassis 9), in the forecast queue * Four [[SMP machines]], two with 48 cores (4 sockets x 12 cores, Opteron 6174) and two with 32 (one 4 sockets x 8 cores, Xeon 4620), all with 256 GB RAM and QDR Infiniband (smp1-4). * Two SMP blades in chassis6 (node095-096) with 32 cores (2 sockets x 16 cores) and 256 GB RAM, in the smp queue * A [[GPUs|GPU]] machine, gpu1, in the gpu queue == GPUs == * See the separate page on [[GPUs]] == Servers and dedicated login nodes == * A file server, stri-server, which is a 2 socket x 8 core Xeon machine ** 473 TB of [[storage]] attached via Fibre Channel to this server. * A cair data processing server - [[cair-cluster]], providing logins and storage to CAIR users ** 132 TB of [[storage]] attached via Fibre Channel to this server. * A separate server -- [[cair-forecast]] -- providing an additional 80 TB of storage and processing for CAIR AQF use. * A separate server -- [[lofar-server]] -- providing logins and 100 TB of storage for [[LOFAR-UK Compute Facility|LOFAR-UK]] users * ramius, a dedicated 16-core machine * mancuso, a dedicated 20-core machine * dstorage1-10, file servers providing 830 TB of BeegFS distributed [[storage]] == Networking == * Ethernet and infiniband switches provide connectivity. The image below shows the physical layout of the cluster components as at late 2011. The cluster is no longer in this configuration. http://stri-cluster.herts.ac.uk/cluster2.jpg 3ade3eb3d24dd126324ea0b58bb2a315675ec0d0 627 626 2020-04-10T10:13:42Z Mjh 2 /* Servers and dedicated login nodes */ wikitext text/x-wiki The cluster consists of == head/login nodes == * 2 head nodes: headnode1 and headnode2, each with 192 GB RAM and 2 x 12-core Xeon processors, for user login and development * job/queue server, uhhpc, with 12 cores and 32 GB RAM: web, jobs, mail, DNS etc == compute nodes == Older compute nodes are in 16-blade chassis: currently chassis6-9 are populated. The newer compute nodes, which will be used by most people, are in three racks, rack1-3. The smp machines are outside any particular rack or chassis. 
* 80 Xeons (Gold 6130) 2 socket x 16 core (no hyperthreading) with 192 GB RAM and FDR Infiniband form part of the Main cluster (node001-080: rack1, rack2, rack3), in the main queue * 12 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form the CAIR cluster (node081-92: chassis 6), in the cair_l queue * 2 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 24 GB RAM and DDR Infiniband form part of the main cluster (node093-94: chassis 6), in the main queue * 16 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form part of the CAR cluster (node097-node112: chassis 7), in the car queue * 16 Xeons (X5675s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form the rest of the CAR cluster (node113-128: chassis 8), in the car queue * 12 Xeons (E5-2660s) 2 socket x 8-core with 64 GB RAM and FDR Infiniband form the rest of the main cluster (node129-140: chassis 9), in the main queue * 4 Xeons (E5-2660s) 2 socket x 8-core with 16 GB RAM and FDR Infiniband are dedicated AQF (CAIR) nodes (node141-144: chassis 9), in the forecast queue * Four [[SMP machines]], two with 48 cores (4 sockets x 12 cores, Opteron 6174) and two with 32 (one 4 sockets x 8 cores, Xeon 4620), all with 256 GB RAM and QDR Infiniband (smp1-4). * Two SMP blades in chassis6 (node095-096) with 32 cores (2 sockets x 16 cores) and 256 GB RAM, in the smp queue * A [[GPUs|GPU]] machine, gpu1, in the gpu queue == GPUs == * See the separate page on [[GPUs]] == Servers and dedicated login nodes == * A file server, stri-server, which is a 2 socket x 8 core Xeon machine ** 473 TB of [[storage]] attached via Fibre Channel to this server. * A cair data processing server - [[cair-cluster]], providing logins and storage to CAIR users ** 132 TB of [[storage]] attached via Fibre Channel to this server. * A separate server -- [[cair-forecast]] -- providing an additional 80 TB of storage and processing for CAIR AQF use. * A separate server -- [[lofar-server]] -- providing logins and 100 TB of storage for [[LOFAR-UK Compute Facility|LOFAR-UK]] users * ramius, a dedicated 16-core machine * mancuso, a dedicated 20-core machine * metadata and dstorage1-12, file servers providing 830 TB of BeegFS distributed [[storage]] == Networking == * Ethernet and infiniband switches provide connectivity. The image below shows the physical layout of the cluster components as at late 2011. The cluster is no longer in this configuration. http://stri-cluster.herts.ac.uk/cluster2.jpg 1efd3e1434d7e0aeb888bf0df29b58adeea2fe6d 628 627 2020-04-10T10:14:20Z Mjh 2 wikitext text/x-wiki The cluster consists of == head/login nodes == * 2 head nodes: headnode1 and headnode2, each with 192 GB RAM and 2 x 12-core Xeon processors, for user login and development * job/queue server, uhhpc, with 12 cores and 32 GB RAM: web, jobs, mail, DNS etc == compute nodes == Older compute nodes are in 16-blade chassis: currently chassis6-9 are populated. The newer compute nodes, which will be used by most people, are in three racks, rack1-3. The smp machines are outside any particular rack or chassis. 
* 80 Xeons (Gold 6130) 2 socket x 16 core (no hyperthreading) with 192 GB RAM and FDR Infiniband form part of the Main cluster (node001-080: rack1, rack2, rack3), in the main queue * 12 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form the CAIR cluster (node081-92: chassis 6), in the cair_l queue * 2 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 24 GB RAM and DDR Infiniband form part of the main cluster (node093-94: chassis 6), in the main queue * 16 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form part of the CAR cluster (node097-node112: chassis 7), in the car queue * 16 Xeons (X5675s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form the rest of the CAR cluster (node113-128: chassis 8), in the car queue * 12 Xeons (E5-2660s) 2 socket x 8-core with 64 GB RAM and FDR Infiniband form the rest of the main cluster (node129-140: chassis 9), in the main queue * 4 Xeons (E5-2660s) 2 socket x 8-core with 16 GB RAM and FDR Infiniband are dedicated AQF (CAIR) nodes (node141-144: chassis 9), in the forecast queue * Four [[SMP machines]], two with 48 cores (4 sockets x 12 cores, Opteron 6174) and two with 32 (one 4 sockets x 8 cores, Xeon 4620), all with 256 GB RAM and QDR Infiniband (smp1-4). smp4 is dedicated to LOFAR-UK use. * Two SMP blades in chassis6 (node095-096) with 32 cores (2 sockets x 16 cores) and 256 GB RAM, in the smp queue * A [[GPUs|GPU]] machine, gpu1, in the gpu queue == GPUs == * See the separate page on [[GPUs]] == Servers and dedicated login nodes == * A file server, stri-server, which is a 2 socket x 8 core Xeon machine ** 473 TB of [[storage]] attached via Fibre Channel to this server. * A cair data processing server - [[cair-cluster]], providing logins and storage to CAIR users ** 132 TB of [[storage]] attached via Fibre Channel to this server. * A separate server -- [[cair-forecast]] -- providing an additional 80 TB of storage and processing for CAIR AQF use. * A separate server -- [[lofar-server]] -- providing logins and 100 TB of storage for [[LOFAR-UK Compute Facility|LOFAR-UK]] users * ramius, a dedicated 16-core machine * mancuso, a dedicated 20-core machine * metadata and dstorage1-12, file servers providing 830 TB of BeegFS distributed [[storage]] == Networking == * Ethernet and infiniband switches provide connectivity. The image below shows the physical layout of the cluster components as at late 2011. The cluster is no longer in this configuration. http://stri-cluster.herts.ac.uk/cluster2.jpg 4b90ca254c302a344c61b649ce12df8b17962627 629 628 2020-04-10T10:14:34Z Mjh 2 /* Servers and dedicated login nodes */ wikitext text/x-wiki The cluster consists of == head/login nodes == * 2 head nodes: headnode1 and headnode2, each with 192 GB RAM and 2 x 12-core Xeon processors, for user login and development * job/queue server, uhhpc, with 12 cores and 32 GB RAM: web, jobs, mail, DNS etc == compute nodes == Older compute nodes are in 16-blade chassis: currently chassis6-9 are populated. The newer compute nodes, which will be used by most people, are in three racks, rack1-3. The smp machines are outside any particular rack or chassis. 
* 80 Xeons (Gold 6130) 2 socket x 16 core (no hyperthreading) with 192 GB RAM and FDR Infiniband form part of the Main cluster (node001-080: rack1, rack2, rack3), in the main queue * 12 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form the CAIR cluster (node081-92: chassis 6), in the cair_l queue * 2 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 24 GB RAM and DDR Infiniband form part of the main cluster (node093-94: chassis 6), in the main queue * 16 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form part of the CAR cluster (node097-node112: chassis 7), in the car queue * 16 Xeons (X5675s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form the rest of the CAR cluster (node113-128: chassis 8), in the car queue * 12 Xeons (E5-2660s) 2 socket x 8-core with 64 GB RAM and FDR Infiniband form the rest of the main cluster (node129-140: chassis 9), in the main queue * 4 Xeons (E5-2660s) 2 socket x 8-core with 16 GB RAM and FDR Infiniband are dedicated AQF (CAIR) nodes (node141-144: chassis 9), in the forecast queue * Four [[SMP machines]], two with 48 cores (4 sockets x 12 cores, Opteron 6174) and two with 32 (one 4 sockets x 8 cores, Xeon 4620), all with 256 GB RAM and QDR Infiniband (smp1-4). smp4 is dedicated to LOFAR-UK use. * Two SMP blades in chassis6 (node095-096) with 32 cores (2 sockets x 16 cores) and 256 GB RAM, in the smp queue * A [[GPUs|GPU]] machine, gpu1, in the gpu queue == GPUs == * See the separate page on [[GPUs]] == Servers and dedicated login nodes == * A file server, stri-server, which is a 2 socket x 8 core Xeon machine ** 473 TB of [[storage]] attached via Fibre Channel to this server. * A cair data processing server - [[cair-cluster]], providing logins and storage to CAIR users ** 132 TB of [[storage]] attached via Fibre Channel to this server. * A separate server -- [[cair-forecast]] -- providing an additional 80 TB of storage and processing for CAIR AQF use. * A separate server -- [[lofar-server]] -- providing logins and 100 TB of storage for [[LOFAR-UK Compute Facility|LOFAR-UK]] users * ramius, a dedicated 16-core machine * mancuso, a dedicated 20-core machine * metadata and dstorage1-12, file servers providing 1.1 PB of BeegFS distributed [[storage]] == Networking == * Ethernet and infiniband switches provide connectivity. The image below shows the physical layout of the cluster components as at late 2011. The cluster is no longer in this configuration. http://stri-cluster.herts.ac.uk/cluster2.jpg e1fa733580142d2c5a27ff32ab78efc53e9810bf 630 629 2020-04-10T10:15:16Z Mjh 2 /* compute nodes */ wikitext text/x-wiki The cluster consists of == head/login nodes == * 2 head nodes: headnode1 and headnode2, each with 192 GB RAM and 2 x 12-core Xeon processors, for user login and development * job/queue server, uhhpc, with 12 cores and 32 GB RAM: web, jobs, mail, DNS etc == compute nodes == Older compute nodes are in 16-blade chassis: currently chassis6-9 are populated. The newer compute nodes, which will be used by most people, are in three racks, rack1-3. The smp machines are outside any particular rack or chassis. 
* 80 Xeons (Gold 6130) 2 socket x 16 core (no hyperthreading) with 192 GB RAM and FDR Infiniband form part of the Main cluster (node001-080: rack1, rack2, rack3), in the main queue * 12 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form the CAIR cluster (node081-92: chassis 6), in the cair_l queue * 2 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 24 GB RAM and DDR Infiniband form part of the main cluster (node093-94: chassis 6), in the main queue * 16 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form part of the CAR cluster (node097-node112: chassis 7), in the car queue * 16 Xeons (X5675s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form the rest of the CAR cluster (node113-128: chassis 8), in the car queue * 12 Xeons (E5-2660s) 2 socket x 8-core with 64 GB RAM and FDR Infiniband form the rest of the main cluster (node129-140: chassis 9), in the main queue * 4 Xeons (E5-2660s) 2 socket x 8-core with 16 GB RAM and FDR Infiniband are dedicated AQF (CAIR) nodes (node141-144: chassis 9), in the forecast queue * Four [[SMP machines]], two with 48 cores (4 sockets x 12 cores, Opteron 6174) and two with 32 (one 4 sockets x 8 cores, Xeon 4620), all with 256 GB RAM and QDR Infiniband (smp1-4). smp4 is dedicated to LOFAR-UK use. * Two SMP blades in chassis6 (node095-096) with 32 cores (2 sockets x 16 cores) and 256 GB RAM, in the smp queue * Three [[GPUs|GPU]] machines, gpu1-3, in the gpu queue == GPUs == * See the separate page on [[GPUs]] == Servers and dedicated login nodes == * A file server, stri-server, which is a 2 socket x 8 core Xeon machine ** 473 TB of [[storage]] attached via Fibre Channel to this server. * A cair data processing server - [[cair-cluster]], providing logins and storage to CAIR users ** 132 TB of [[storage]] attached via Fibre Channel to this server. * A separate server -- [[cair-forecast]] -- providing an additional 80 TB of storage and processing for CAIR AQF use. * A separate server -- [[lofar-server]] -- providing logins and 100 TB of storage for [[LOFAR-UK Compute Facility|LOFAR-UK]] users * ramius, a dedicated 16-core machine * mancuso, a dedicated 20-core machine * metadata and dstorage1-12, file servers providing 1.1 PB of BeegFS distributed [[storage]] == Networking == * Ethernet and infiniband switches provide connectivity. The image below shows the physical layout of the cluster components as at late 2011. The cluster is no longer in this configuration. http://stri-cluster.herts.ac.uk/cluster2.jpg 33c2a5d3aa11266bdc18d0ab440eb687b4d8c4c8 654 630 2020-09-30T18:14:00Z Mjh 2 wikitext text/x-wiki The cluster consists of == head/login nodes == * 2 head nodes: headnode1 and headnode2, each with 192 GB RAM and 2 x 12-core Xeon processors, for user login and development * job/queue server, uhhpc, with 12 cores and 32 GB RAM: web, jobs, mail, DNS etc == compute nodes == Older compute nodes are in 16-blade chassis: currently chassis6-9 are populated. The newer compute nodes, which will be used by most people, are in three racks, rack1-3. The smp machines are outside any particular rack or chassis. 
* 80 Xeons (Gold 6130) 2 socket x 16 core (no hyperthreading) with 192 GB RAM and FDR Infiniband form part of the Main cluster (node001-080: rack1, rack2, rack3), in the main queue * 12 Xeons (X5650s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form the CAIR cluster (node081-92: chassis 6), in the cair_l queue * 2 Xeons (E5520s) 2 socket x 4-core (no hyperthreading) with 24 GB RAM and DDR Infiniband form part of the main cluster (node093-94: chassis 6), in the main queue * 16 AMD EPYC 7452 2 socket x 32-core (no hyperthreading) with 256 GB RAM and FDR Infiniband form part of the Main cluster (node097-112: rack4) in the test queue. * 16 Xeons (X5675s) 2 socket x 6-core (no hyperthreading) with 24 GB RAM and QDR Infiniband form the CAR cluster (node113-128: chassis 8), in the car queue * 12 Xeons (E5-2660s) 2 socket x 8-core with 64 GB RAM and FDR Infiniband form the rest of the main cluster (node129-140: chassis 9), in the main queue * 4 Xeons (E5-2660s) 2 socket x 8-core with 16 GB RAM and FDR Infiniband are dedicated AQF (CAIR) nodes (node141-144: chassis 9), in the forecast queue * Four [[SMP machines]], two with 48 cores (4 sockets x 12 cores, Opteron 6174) and two with 32 (one 4 sockets x 8 cores, Xeon 4620), all with 256 GB RAM and QDR Infiniband (smp1-4). smp4 is dedicated to LOFAR-UK use. * Two SMP blades in chassis6 (node095-096) with 32 cores (2 sockets x 16 cores) and 256 GB RAM, in the smp queue * Three [[GPUs|GPU]] machines, gpu1-3, in the gpu queue == GPUs == * See the separate page on [[GPUs]] == Servers and dedicated login nodes == * A file server, stri-server, which is a 2 socket x 8 core Xeon machine ** 473 TB of [[storage]] attached via Fibre Channel to this server. * A cair data processing server - [[cair-cluster]], providing logins and storage to CAIR users ** 132 TB of [[storage]] attached via Fibre Channel to this server. * A separate server -- [[cair-forecast]] -- providing an additional 80 TB of storage and processing for CAIR AQF use. * A separate server -- [[lofar-server]] -- providing logins and 100 TB of storage for [[LOFAR-UK Compute Facility|LOFAR-UK]] users * ramius, a dedicated 16-core machine * mancuso, a dedicated 20-core machine * metadata and dstorage1-12, file servers providing 1.1 PB of BeegFS distributed [[storage]] == Networking == * Ethernet and infiniband switches provide connectivity. The image below shows the physical layout of the cluster components as at late 2011. The cluster is no longer in this configuration. http://stri-cluster.herts.ac.uk/cluster2.jpg 7610cc6fc7ac1c275f0cce5e0a3f7d0f8029e5fa Ciao 0 66 631 436 2020-04-15T17:52:22Z Mjh 2 wikitext text/x-wiki CIAO is the Chandra data reduction software. Access it by doing <tt>source /soft/ciao-4.12/ciao-4.12/bin/ciao.csh</tt> dce835a76a9e460bf5b91419222e7a457f5c01c9 Software 0 17 632 619 2020-04-15T17:53:14Z Mjh 2 /* Astronomical software */ wikitext text/x-wiki This page documents the software installed on the cluster and its location. Detailed local documentation of software, if available, goes in a page specific to that software linked from this list. Users are encouraged to update the wiki with descriptions of the software they use. Contact the [[administrators]] if you need an upgraded version of any of these. If you need any freely available software package that runs on Linux, in general we can install it. 
We do not have resources to pay for licences for commercial software, though, so if you need a commercial software package to be installed you will also need to provide funding for a licence. = Programming languages and development environments = * GNU C, C++ and Fortran: available by default, <tt>module load gcc-6.4</tt> for newer versions * Intel C and Fortran: <tt>module load intel</tt> * Python 2 and 3 and many [[Python packages]] * [[Matlab]]: in <tt>/soft/MATLAB/R2018b/bin/matlab</tt> * [[IDL]]: in <tt>/soft/idl/idl/bin</tt> * [[R]]: installed by default or <tt>/soft/R</tt> * [[Julia]]: <tt>module load julia</tt> * [[GPUs|CUDA]]: <tt>module load cuda...</tt> = Astronomical software = * Astropy: see [[Python packages]] * [[AIPS]]: 31DEC18 installed in <tt>/soft/aips</tt> * [[CASA]]: installed in <tt>/soft/casa...</tt> * [[Starlink]]: in <tt>/soft/star</tt> or <tt>/soft/stardev</tt> * [[LOFAR]]: in <tt>/soft/lofar-xxxxxx</tt>, see page for details * [[Miriad]]: in <tt> /soft/miriad</tt> * [[ciao]]: in <tt>/soft/ciao-x.x</tt> * [[SAS]]: in <tt>/soft/xmmsas_xxxxxx</tt> * [[PLUTO]]: see page for documentation * [[aoflagger]]: in <tt>/soft/aoflagger</tt> * [[wsclean]]: in <tt>/soft/wsclean</tt> * [[Brats]]: in <tt>/soft/brats</tt> * [[Topcat]] and Stilts: in <tt>/soft/topcat</tt> = Engineering = * [[Ansys Fluent]]: in <tt>/soft/ansys_inc</tt> * [[Star-CCM+]]: in <tt>/soft/STAR-CCM+14.02.010</tt> * [[Cantera]]: in <tt>/soft/cantera</tt> * [[Converge]]: in <tt>/soft/CONVERGE_Studio/</tt> = Molecular dynamics = * [[NAMD]]: in <tt>/soft/NAMD_2.13_Linux-x86_64-ibverbs-smp-CUDA * <u>[[Gromacs]]</u>: in <tt>/soft/gromacs-2018-gpu</tt> * Autodock [[Vina]]: in <tt>/soft/autodock_vina_1_1_1_linux_x86</tt> * [[Autodock]] : in <tt>/soft/autodock</tt> * [[iGemDock]]: in <tt>/soft/iGEMDOCKv2.1-centos/</tt> * [[Miriad]]: in <tt> /soft/miriad</tt> * [[ciao]]: in <tt>/soft/ciao-x.x</tt> = Computational neuroscience = * [[neuron]]: in <tt>/soft/nrn</tt> = Optimization = * [[Gurobi]]: in <tt>/soft/gurobi</tt> = Containerization = * [[Singularity]] in <tt>/soft/bin/singularity</tt> d3c94d11ed660e21a4ca525668bd5030d72e9261 640 632 2020-06-10T17:38:34Z Mjh 2 wikitext text/x-wiki This page documents the software installed on the cluster and its location. Detailed local documentation of software, if available, goes in a page specific to that software linked from this list. Users are encouraged to update the wiki with descriptions of the software they use. Contact the [[administrators]] if you need an upgraded version of any of these. If you need any freely available software package that runs on Linux, in general we can install it. We do not have resources to pay for licences for commercial software, though, so if you need a commercial software package to be installed you will also need to provide funding for a licence. 
= Programming languages and development environments = * GNU C, C++ and Fortran: available by default, <tt>module load gcc-6.4</tt> for newer versions * Intel C and Fortran: <tt>module load intel</tt> * Python 2 and 3 and many [[Python packages]] * [[Matlab]]: in <tt>/soft/MATLAB/R2018b/bin/matlab</tt> * [[IDL]]: in <tt>/soft/idl/idl/bin</tt> * [[R]]: installed by default or <tt>/soft/R</tt> * [[Julia]]: <tt>module load julia</tt> * [[GPUs|CUDA]]: <tt>module load cuda...</tt> = Astronomical software = * Astropy: see [[Python packages]] * [[AIPS]]: 31DEC18 installed in <tt>/soft/aips</tt> * [[CASA]]: installed in <tt>/soft/casa...</tt> * [[Starlink]]: in <tt>/soft/star</tt> or <tt>/soft/stardev</tt> * [[LOFAR]]: in <tt>/soft/lofar-xxxxxx</tt>, see page for details * [[Miriad]]: in <tt> /soft/miriad</tt> * [[ciao]]: in <tt>/soft/ciao-x.x</tt> * [[SAS]]: in <tt>/soft/xmmsas_xxxxxx</tt> * [[PLUTO]]: see page for documentation * [[aoflagger]]: in <tt>/soft/aoflagger</tt> * [[wsclean]]: in <tt>/soft/wsclean</tt> * [[Brats]]: in <tt>/soft/brats</tt> * [[Topcat]] and Stilts: in <tt>/soft/topcat</tt> * [[Miriad]]: in <tt> /soft/miriad</tt> * [[ciao]]: in <tt>/soft/ciao-x.x</tt> = Engineering = * [[Ansys Fluent]]: in <tt>/soft/ansys_inc</tt> * [[Star-CCM+]]: in <tt>/soft/STAR-CCM+14.02.010</tt> * [[Cantera]]: in <tt>/soft/cantera</tt> * [[Converge]]: in <tt>/soft/CONVERGE_Studio/</tt> = Molecular dynamics = * [[NAMD]]: in <tt>/soft/NAMD_2.13_Linux-x86_64-ibverbs-smp-CUDA * <u>[[Gromacs]]</u>: in <tt>/soft/gromacs-2018-gpu</tt> * Autodock [[Vina]]: in <tt>/soft/autodock_vina_1_1_1_linux_x86</tt> * [[Autodock]] : in <tt>/soft/autodock</tt> * [[iGemDock]]: in <tt>/soft/iGEMDOCKv2.1-centos/</tt> = Computational neuroscience = * [[neuron]]: in <tt>/soft/nrn</tt> = Optimization = * [[Gurobi]]: in <tt>/soft/gurobi</tt> = Containerization = * [[Singularity]] in <tt>/soft/bin/singularity</tt> b555f59de096091d54e551e6ae7aa15a537acb24 641 640 2020-06-10T17:38:46Z Mjh 2 /* Astronomical software */ wikitext text/x-wiki This page documents the software installed on the cluster and its location. Detailed local documentation of software, if available, goes in a page specific to that software linked from this list. Users are encouraged to update the wiki with descriptions of the software they use. Contact the [[administrators]] if you need an upgraded version of any of these. If you need any freely available software package that runs on Linux, in general we can install it. We do not have resources to pay for licences for commercial software, though, so if you need a commercial software package to be installed you will also need to provide funding for a licence. 
= Programming languages and development environments = * GNU C, C++ and Fortran: available by default, <tt>module load gcc-6.4</tt> for newer versions * Intel C and Fortran: <tt>module load intel</tt> * Python 2 and 3 and many [[Python packages]] * [[Matlab]]: in <tt>/soft/MATLAB/R2018b/bin/matlab</tt> * [[IDL]]: in <tt>/soft/idl/idl/bin</tt> * [[R]]: installed by default or <tt>/soft/R</tt> * [[Julia]]: <tt>module load julia</tt> * [[GPUs|CUDA]]: <tt>module load cuda...</tt> = Astronomical software = * Astropy: see [[Python packages]] * [[AIPS]]: 31DEC18 installed in <tt>/soft/aips</tt> * [[CASA]]: installed in <tt>/soft/casa...</tt> * [[Starlink]]: in <tt>/soft/star</tt> or <tt>/soft/stardev</tt> * [[LOFAR]]: in <tt>/soft/lofar-xxxxxx</tt>, see page for details * [[Miriad]]: in <tt> /soft/miriad</tt> * [[ciao]]: in <tt>/soft/ciao-x.x</tt> * [[SAS]]: in <tt>/soft/xmmsas_xxxxxx</tt> * [[PLUTO]]: see page for documentation * [[aoflagger]]: in <tt>/soft/aoflagger</tt> * [[wsclean]]: in <tt>/soft/wsclean</tt> * [[Brats]]: in <tt>/soft/brats</tt> * [[Topcat]] and Stilts: in <tt>/soft/topcat</tt> = Engineering = * [[Ansys Fluent]]: in <tt>/soft/ansys_inc</tt> * [[Star-CCM+]]: in <tt>/soft/STAR-CCM+14.02.010</tt> * [[Cantera]]: in <tt>/soft/cantera</tt> * [[Converge]]: in <tt>/soft/CONVERGE_Studio/</tt> = Molecular dynamics = * [[NAMD]]: in <tt>/soft/NAMD_2.13_Linux-x86_64-ibverbs-smp-CUDA * <u>[[Gromacs]]</u>: in <tt>/soft/gromacs-2018-gpu</tt> * Autodock [[Vina]]: in <tt>/soft/autodock_vina_1_1_1_linux_x86</tt> * [[Autodock]] : in <tt>/soft/autodock</tt> * [[iGemDock]]: in <tt>/soft/iGEMDOCKv2.1-centos/</tt> = Computational neuroscience = * [[neuron]]: in <tt>/soft/nrn</tt> = Optimization = * [[Gurobi]]: in <tt>/soft/gurobi</tt> = Containerization = * [[Singularity]] in <tt>/soft/bin/singularity</tt> a83230b44600bc39ab65a8502ca644353ac2df6b 642 641 2020-06-10T17:39:29Z Mjh 2 wikitext text/x-wiki This page documents the software installed on the cluster and its location. Detailed local documentation of software, if available, goes in a page specific to that software linked from this list. Users are encouraged to update the wiki with descriptions of the software they use. Contact the [[administrators]] if you need an upgraded version of any of these. If you need any freely available software package that runs on Linux, in general we can install it. We do not have resources to pay for licences for commercial software, though, so if you need a commercial software package to be installed you will also need to provide funding for a licence. 
= Programming languages and development environments = * GNU C, C++ and Fortran: available by default, <tt>module load gcc-6.4</tt> for newer versions * Intel C and Fortran: <tt>module load intel</tt> * Python 2 and 3 and many [[Python packages]] * [[Matlab]]: in <tt>/soft/MATLAB/R2018b/bin/matlab</tt> * [[IDL]]: in <tt>/soft/idl/idl/bin</tt> * [[R]]: installed by default or <tt>/soft/R</tt> * [[Julia]]: <tt>module load julia</tt> * [[GPUs|CUDA]]: <tt>module load cuda...</tt> = Astronomical software = * Astropy: see [[Python packages]] * [[AIPS]]: 31DEC18 installed in <tt>/soft/aips</tt> * [[CASA]]: installed in <tt>/soft/casa...</tt> * [[Starlink]]: in <tt>/soft/star</tt> or <tt>/soft/stardev</tt> * [[LOFAR]]: in <tt>/soft/lofar-xxxxxx</tt>, see page for details * [[Miriad]]: in <tt> /soft/miriad</tt> * [[ciao]]: in <tt>/soft/ciao-x.x</tt> * [[SAS]]: in <tt>/soft/xmmsas_xxxxxx</tt> * [[PLUTO]]: see page for documentation * [[aoflagger]]: in <tt>/soft/aoflagger</tt> * [[wsclean]]: in <tt>/soft/wsclean</tt> * [[Brats]]: in <tt>/soft/brats</tt> * [[Topcat]] and Stilts: in <tt>/soft/topcat</tt> = Engineering = * [[Ansys Fluent]]: in <tt>/soft/ansys_inc</tt> * [[Star-CCM+]]: in <tt>/soft/STAR-CCM+14.02.010</tt> * [[Cantera]]: in <tt>/soft/cantera</tt> * [[Converge]]: in <tt>/soft/CONVERGE_Studio/</tt> = Molecular dynamics = * [[NAMD]]: in <tt>/soft/NAMD_2.13_Linux-x86_64-ibverbs-smp-CUDA</tt> * [[Gromacs]]: in <tt>/soft/gromacs-2018-gpu</tt> * Autodock [[Vina]]: in <tt>/soft/autodock_vina_1_1_1_linux_x86</tt> * [[Autodock]] : in <tt>/soft/autodock</tt> * [[iGemDock]]: in <tt>/soft/iGEMDOCKv2.1-centos/</tt> = Computational neuroscience = * [[neuron]]: in <tt>/soft/nrn</tt> = Optimization = * [[Gurobi]]: in <tt>/soft/gurobi</tt> = Containerization = * [[Singularity]] in <tt>/soft/bin/singularity</tt> 76ccae74b69e46d6aa4a72061e54dcfb298026c5 SAS 0 85 633 2020-04-15T20:23:01Z Mjh 2 Created page with "SAS is the XMM data reduction software. To run SAS you must first have HEADAS set up. In bash <pre> export HEADAS=/soft/heasoft-6.27.1/x86_64-pc-linux-gnu-libc2.17 . $HEADAS..." wikitext text/x-wiki SAS is the XMM data reduction software. To run SAS you must first have HEADAS set up. In bash <pre> export HEADAS=/soft/heasoft-6.27.1/x86_64-pc-linux-gnu-libc2.17 . $HEADAS/headas-init.sh </pre> In tcsh <pre> setenv HEADAS /soft/heasoft-6.27.1/x86_64-pc-linux-gnu-libc2.17 source $HEADAS/headas-init.csh </pre> Then source SAS: <pre> source /soft/xmmsas_20190531_1155/setsas.sh </pre> or <pre> source /soft/xmmsas_20190531_1155/setsas.csh </pre> Set the <tt>SAS_CCFPATH</tt> variable to <tt>/beegfs/car/XMM/CCF</tt>. 4cc37cf948f7a85c531bad24dbe008e4f9045009 Known problems 0 25 634 484 2020-04-24T08:57:08Z Mjh 2 wikitext text/x-wiki == Known problems == * Although the Infiniband links carry NFS traffic between the nodes and the head node, they are not using RDMA and so their latency is higher and bandwidth lower than it should be. This seems to be a problem with the Linux kernels on the nodes; NFS over RDMA+infiniband is unstable. Less of an issue now most IO goes to /beegfs which does use RDMA. * The scheduler sometimes crashes for unknown reasons causing jobs not to run. (Regularly run scripts check and restart the scheduler.) * The scheduler very occasionally will not run a job that could be run immediately in free resources. If <tt>checkjob</tt> shows 'job can run' and <tt>showstart</tt> shows an immediate start, but your job does not run, then please contact the [[administrators]]. 
* Node specifications of the form <tt>nodes=main:ppn=16</tt> or <tt>nodes=smp:ppn=1</tt> will severely confuse the scheduler, although they are valid. Please do not use queue names in node specifications: always do something like <tt>-q main -l nodes=1:ppn=16</tt> instead. == Node hardware/sw (for admin use only) == * node087, node093, node103: hardware failures * node143 -- unstable clock issue a320759c59d2ede4fd391e69477f4bc13e666d4e Shell 0 80 635 602 2020-05-01T11:17:16Z Mjh 2 wikitext text/x-wiki For historical reasons, the default shell on UHHPC is tcsh. The chsh command does not work on the cluster. If you wish to switch to bash, you must ask the [[administrators]] to make the change. There is a bug in tcsh history processing which means that the .history file may become corrupt after you have run many jobs. To fix this temporarily, remove your .history file. To fix it permanently (at the cost of no longer having persistent history), do <pre> cd rm .history ln -s /dev/null .history </pre> bash does not suffer from this problem. 481d8a6e61d51848a74a969c52cb1b3012912d4d Python packages 0 49 636 567 2020-05-13T12:51:44Z Mjh 2 wikitext text/x-wiki Local python packages installed in <tt>/soft</tt> include * astropy [http://pypi.python.org/packages/source/a/astropy/astropy-0.2.4.tar.gz] * kapteyn [http://www.astro.rug.nl/software/kapteyn/] * h5py * mpi4py * hcluster Set your PYTHONPATH to <tt>/soft/python/lib64/python2.7/site-packages</tt> to make these available. Please don't install local copies of software that is already globally available! If you need an upgraded version, talk to the system [[administrators]]. == Python 3.6 == A separate Python 3.6 installation is available. To use this, just run python36 or python3.6 . If you want to use system-wide installations of packages you will need to set your PYTHONPATH to <tt>/soft/python3/usr/local/lib64/python3.6/site-packages</tt>. == Python virtual environments == Sometimes it is necessary to run Python in a clean environment -- particularly if you want Python3 to be the default Python. To set this up do the following (Python3 assumed): <pre> python3 -m venv py3-venv source py3-venv/bin/activate pip3 install --upgrade pip </pre> In future whenever you do <pre> source py3-venv/bin/activate </pre> your python version will be python3 and you will have the ability to install python3 packages to your py3-venv directory using pip. For example, to get set up with Tensorflow with GPU support: <pre> source py3-venv/bin/activate pip install tensorflow module load cuda-10.1 ipython3 </pre> If all goes well then you will be able to do <pre> import tensorflow as tf tf.print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU'))) </pre> 0e49bacca683d0216fa7706a277fc0fd67879e1b 637 636 2020-05-13T12:54:41Z Mjh 2 /* Python virtual environments */ wikitext text/x-wiki Local python packages installed in <tt>/soft</tt> include * astropy [http://pypi.python.org/packages/source/a/astropy/astropy-0.2.4.tar.gz] * kapteyn [http://www.astro.rug.nl/software/kapteyn/] * h5py * mpi4py * hcluster Set your PYTHONPATH to <tt>/soft/python/lib64/python2.7/site-packages</tt> to make these available. Please don't install local copies of software that is already globally available! If you need an upgraded version, talk to the system [[administrators]]. == Python 3.6 == A separate Python 3.6 installation is available. To use this, just run python36 or python3.6 . 
If you want to use system-wide installations of packages you will need to set your PYTHONPATH to <tt>/soft/python3/usr/local/lib64/python3.6/site-packages</tt>. == Python virtual environments == Sometimes it is necessary to run Python in a clean environment -- particularly if you want Python3 to be the default Python. To set this up do the following (Python3 assumed): <pre> python3 -m venv py3-venv source py3-venv/bin/activate pip3 install --upgrade pip </pre> This creates a directory py3-venv which will be used for your own private installation of packages. In future whenever you do <pre> source py3-venv/bin/activate </pre> your python version will be python3 and you will have the ability to install python3 packages to your py3-venv directory using pip. For example, to get set up with Tensorflow with GPU support: <pre> source py3-venv/bin/activate pip install tensorflow module load cuda-10.1 ipython3 </pre> If all goes well then you will be able to do <pre> import tensorflow as tf tf.print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU'))) </pre> e273ce01e09f2a4a17b77ca824aa5e40b5bd3b61 638 637 2020-05-13T12:56:12Z Mjh 2 wikitext text/x-wiki Local python packages installed in <tt>/soft</tt> include * astropy [http://pypi.python.org/packages/source/a/astropy/astropy-0.2.4.tar.gz] * kapteyn [http://www.astro.rug.nl/software/kapteyn/] * h5py * mpi4py * hcluster Set your PYTHONPATH to <tt>/soft/python/lib64/python2.7/site-packages</tt> to make these available. Please don't install local copies of software that is already globally available! If you need an upgraded version, talk to the system [[administrators]]. == Python 3.6 == A separate Python 3.6 installation is available. To use this, just run python36 or python3.6 . If you want to use system-wide installations of packages you will need to set your PYTHONPATH to <tt>/soft/python3/usr/local/lib64/python3.6/site-packages</tt>. == Python virtual environments == Sometimes it is necessary to run Python in a clean environment -- particularly if you want Python3 to be the default Python. To set this up do the following (Python3 assumed): <pre> python3 -m venv py3-venv source py3-venv/bin/activate pip3 install --upgrade pip </pre> This creates a directory py3-venv which will be used for your own private installation of packages. In future whenever you do <pre> source py3-venv/bin/activate </pre> your python version will be python3 and you will have the ability to install python3 packages to your py3-venv directory using pip. For example, to get set up with Tensorflow with GPU support: <pre> source py3-venv/bin/activate pip install tensorflow module load cuda-10.1 ipython3 </pre> If all goes well then you will be able to do <pre> import tensorflow as tf tf.print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU'))) </pre> Note that using this method means you will also have to install any other modules you need, such as matplotlib, using pip install. b0313cfb7475d922a20266f77d7ff95cf7db0935 639 638 2020-06-02T07:49:19Z Mjh 2 /* Python 3.6 */ wikitext text/x-wiki Local python packages installed in <tt>/soft</tt> include * astropy [http://pypi.python.org/packages/source/a/astropy/astropy-0.2.4.tar.gz] * kapteyn [http://www.astro.rug.nl/software/kapteyn/] * h5py * mpi4py * hcluster Set your PYTHONPATH to <tt>/soft/python/lib64/python2.7/site-packages</tt> to make these available. Please don't install local copies of software that is already globally available! 
If you need an upgraded version, talk to the system [[administrators]]. == Python 3.6 == A separate Python 3.6 installation is available. To use this, just run python36 or python3.6 . If you want to use system-wide installations of packages you will need to set your PYTHONPATH to include <tt>/soft/python3/usr/local/lib64/python3.6/site-packages</tt>, and make sure no Python 2 packages are on your PYTHONPATH. If you want Ipython3 add <tt>/soft/python3/usr/local/bin</tt> to your PATH. <tt>module load python3</tt> will make these changes for you. == Python virtual environments == Sometimes it is necessary to run Python in a clean environment -- particularly if you want Python3 to be the default Python. To set this up do the following (Python3 assumed): <pre> python3 -m venv py3-venv source py3-venv/bin/activate pip3 install --upgrade pip </pre> This creates a directory py3-venv which will be used for your own private installation of packages. In future whenever you do <pre> source py3-venv/bin/activate </pre> your python version will be python3 and you will have the ability to install python3 packages to your py3-venv directory using pip. For example, to get set up with Tensorflow with GPU support: <pre> source py3-venv/bin/activate pip install tensorflow module load cuda-10.1 ipython3 </pre> If all goes well then you will be able to do <pre> import tensorflow as tf tf.print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU'))) </pre> Note that using this method means you will also have to install any other modules you need, such as matplotlib, using pip install. 6e9f66d4acacb5707bd1ae197b4503be9ce7892b 656 639 2020-10-07T08:53:10Z Mjh 2 wikitext text/x-wiki Local python packages installed in <tt>/soft</tt> include * numpy * scipy * astropy * tensorflow * h5py * mpi4py Set your PYTHONPATH to <tt>/soft/python/lib64/python2.7/site-packages</tt> to make these and many more available. You can install your own local copies of packages that don't exist on the system using <tt>pip install --user PACKAGENAME</tt>. These will appear in your <tt>.local</tt> directory. However, please don't install local copies of software that is already globally available! If you need an upgraded version, talk to the system [[administrators]]. == Python 3.6 == A separate Python 3.6 installation is available. To use this, just run python36 or python3.6 . If you want to use system-wide installations of packages you will need to set your PYTHONPATH to include <tt>/soft/python3/usr/local/lib64/python3.6/site-packages</tt>, and make sure no Python 2 packages are on your PYTHONPATH. If you want Ipython3 add <tt>/soft/python3/usr/local/bin</tt> to your PATH. <tt>module load python3</tt> will make these changes for you. == Python virtual environments == Sometimes it is necessary to run Python in a clean environment -- particularly if you want Python3 to be the default Python. To set this up do the following (Python3 assumed): <pre> python3 -m venv py3-venv source py3-venv/bin/activate pip3 install --upgrade pip </pre> This creates a directory py3-venv which will be used for your own private installation of packages. In future whenever you do <pre> source py3-venv/bin/activate </pre> your python version will be python3 and you will have the ability to install python3 packages to your py3-venv directory using pip. 
For example, to get set up with Tensorflow with GPU support: <pre> source py3-venv/bin/activate pip install tensorflow module load cuda-10.1 ipython3 </pre> If all goes well then you will be able to do <pre> import tensorflow as tf tf.print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU'))) </pre> Note that using this method means you will also have to install any other modules you need, such as matplotlib, using pip install. 5dd8852b8d47141fb4b77430e01f41f851aa6045 657 656 2020-10-07T08:53:55Z Mjh 2 wikitext text/x-wiki Local python packages installed in <tt>/soft</tt> include * numpy * scipy * astropy * tensorflow * h5py * mpi4py Set your PYTHONPATH to <tt>/soft/python/lib64/python2.7/site-packages</tt> to make these and many more available. You can install your own local copies of packages that don't exist on the system using <tt>pip install --user PACKAGENAME</tt>. These will appear in your <tt>.local</tt> directory. However, please don't install local copies of software that is already globally available! If you need an upgraded version, talk to the system [[administrators]]. == Python 3.6 == A separate Python 3.6 installation is available. To use this, just run python36 or python3.6 . If you want to use system-wide installations of packages you will need to set your PYTHONPATH to include <tt>/soft/python3/usr/local/lib64/python3.6/site-packages</tt>, and make sure no Python 2 packages are on your PYTHONPATH. If you want Ipython3 add <tt>/soft/python3/usr/local/bin</tt> to your PATH. <tt>module load python3</tt> will make these changes for you. <tt>pip3</tt> can be used to install local copies of python3 packages. == Python virtual environments == Sometimes it is necessary to run Python in a clean environment -- particularly if you want Python3 to be the default Python. To set this up do the following (Python3 assumed): <pre> python3 -m venv py3-venv source py3-venv/bin/activate pip3 install --upgrade pip </pre> This creates a directory py3-venv which will be used for your own private installation of packages. In future whenever you do <pre> source py3-venv/bin/activate </pre> your python version will be python3 and you will have the ability to install python3 packages to your py3-venv directory using pip. For example, to get set up with Tensorflow with GPU support: <pre> source py3-venv/bin/activate pip install tensorflow module load cuda-10.1 ipython3 </pre> If all goes well then you will be able to do <pre> import tensorflow as tf tf.print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU'))) </pre> Note that using this method means you will also have to install any other modules you need, such as matplotlib, using pip install. 0ab8b00b2e6de65e79bda8e1fabae2fd98be98d3 658 657 2020-10-07T08:54:25Z Mjh 2 wikitext text/x-wiki Local python packages installed in <tt>/soft</tt> include * numpy * scipy * astropy * tensorflow * h5py * mpi4py Set your PYTHONPATH to <tt>/soft/python/lib64/python2.7/site-packages</tt> to make these and many more available. You can install your own local copies of packages that don't exist on the system using <tt>pip install --user PACKAGENAME</tt>. These will appear in your <tt>.local</tt> directory. However, please don't install local copies of software that is already globally available! If you need an upgraded version, talk to the system [[administrators]]. == Python 3.6 == A separate Python 3.6 installation is available. To use this, just run python36 or python3.6 . 
If you want to use system-wide installations of packages you will need to set your PYTHONPATH to include <tt>/soft/python3/usr/local/lib64/python3.6/site-packages</tt>, and make sure no Python 2 packages are on your PYTHONPATH. If you want Ipython3 add <tt>/soft/python3/usr/local/bin</tt> to your PATH. <tt>module load python3</tt> will make these changes for you. <tt>pip3</tt> can be used to install local copies of python3 packages. == Python virtual environments == Sometimes it is necessary to run Python in a clean environment -- particularly if you want Python3 to be the default Python. To set this up do the following (Python3 assumed): <pre> python3 -m venv py3-venv source py3-venv/bin/activate pip3 install --upgrade pip </pre> This creates a directory py3-venv which will be used for your own private installation of packages. In future whenever you do <pre> source py3-venv/bin/activate </pre> your python version will be python3 and you will have the ability to install python3 packages to your py3-venv directory using pip (without <tt>--user</tt> option). For example, to get set up with Tensorflow with GPU support: <pre> source py3-venv/bin/activate pip install tensorflow module load cuda-10.1 ipython3 </pre> If all goes well then you will be able to do <pre> import tensorflow as tf tf.print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU'))) </pre> Note that using this method means you will also have to install any other modules you need, such as matplotlib, using pip install. 930ca907503b21db125e00135a10184073125903 Modules 0 33 643 193 2020-06-16T20:37:38Z Mjh 2 wikitext text/x-wiki The cluster uses the <tt>environment-modules</tt> package to manage environment variable settings for software packages. When a 'module' is loaded, the appropriate environment settings are added to the start of your PATH, MANPATH, LD_LIBRARY_PATH etc, and other variables used by the relevant package may be set as well. When a module is unloaded, the changes made to environment variables are undone. Documentation of this package is available at this link[http://modules.sourceforge.net/], or type <tt>man module</tt> Basic commands include: * <tt>module list</tt>. See what modules you have loaded. * <tt>module avail</tt>. List what modules are available to you. * <tt>module load [modulename]</tt>. Load a module, e.g. <tt>module load mpich2-local</tt> * <tt>module unload [modulename]</tt>. Unload a module. * <tt>module show [modulename]</tt>. Show what loading a module does. See <tt>module list</tt> for a list of currently available modules. You may use <tt>module</tt> commands in your .bashrc or .cshrc. For example, I have <pre> module unload mpich2-x86_64 module load mpich2-local </pre> as the first two lines of my .cshrc. Module commands do not work in job scripts or scripts run by jobs because the relevant aliases are only set up by login shells. This means to get the effect of loading a module you should either manually set environment variables as described in <tt>module show</tt> or do <pre> eval `/usr/bin/modulecmd [shell] load [module]` </pre> where <tt>[shell]</tt> is the name of the shell you are using. We are happy to add other environments as modules -- please contact the cluster [[Administrators]]. 9f861f8f6bd454406f949d4b4ad342ec090067d9 653 643 2020-09-30T18:11:43Z Mjh 2 wikitext text/x-wiki The cluster uses the <tt>environment-modules</tt> package to manage environment variable settings for software packages. 
When a 'module' is loaded, the appropriate environment settings are added to the start of your PATH, MANPATH, LD_LIBRARY_PATH etc, and other variables used by the relevant package may be set as well. When a module is unloaded, the changes made to environment variables are undone. Documentation of this package is available at this link[http://modules.sourceforge.net/], or type <tt>man module</tt> Basic commands include: * <tt>module list</tt>. See what modules you have loaded. * <tt>module avail</tt>. List what modules are available to you. * <tt>module load [modulename]</tt>. Load a module, e.g. <tt>module load mpich2-local</tt> * <tt>module unload [modulename]</tt>. Unload a module. * <tt>module show [modulename]</tt>. Show what loading a module does. See <tt>module list</tt> for a list of currently available modules. You may use <tt>module</tt> commands in your .bashrc or .cshrc, e.g. to select your preferred [[MPI]] environment. Module commands do not work in job scripts or scripts run by jobs because the relevant aliases are only set up by login shells. This means to get the effect of loading a module you should either manually set environment variables as described in <tt>module show</tt> or do <pre> eval `/usr/bin/modulecmd [shell] load [module]` </pre> where <tt>[shell]</tt> is the name of the shell you are using. We are happy to add other environments as modules -- please contact the cluster [[Administrators]]. d03d2f02424fc8fa14148e1f4b3745cf1b952f38 To do 0 78 644 590 2020-09-19T08:57:46Z Mjh 2 wikitext text/x-wiki = To do next downtime = * beegfs upgrade * check IB firmware on dstorage nodes, see below == Notes == To update using lifecycle controller: * run chromium-browser, unblock popups * log in to nodexxx.management * select 'launch virtual console' * select lifecycle controller from boot menu and reboot (console controls -- ctrl-alt-del -- apply) * wait for boot to lifecycle controller * click Ok to all options and wait for networking, ignore ipv6 warning * click 'get the latest firmware' * select 'ftp server' * say 'ftp.dell.com' NOT 'downloads.dell.com' * wait... * click 'apply' * wait... * the machine may power off, power it on again (you may need to restart virtual console) * eventually the machine will reboot to the lifecycle controller again. Now click 'exit' at top right. * close the virtual console when a Linux console prompt is showing. == IB problem list == Returned using iblinkinfo on a new OFED install. Doesn't work with standard one. 
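Roughly, the listing can be regenerated with something like the following (a sketch -- the exact invocation and filtering may have differed):
<pre>
# show only links reported as running below their capable speed
iblinkinfo | grep 'Could be'
</pre>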
<pre> 40 3[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 252 1[ ] "dstorage5 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 7[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 392 1[ ] "metadata mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 8[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 36 1[ ] "dstorage1 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 9[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 160 1[ ] "lofar-server mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 10[ ] ==( 4X 2.5 Gbps Active/ LinkUp)==> 4 1[ ] "uhhpc mlx4_0" ( Could be 10.0 Gbps) 40 13[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 432 1[ ] "smp4 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 14[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 224 1[ ] "dstorage4 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 15[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 12 1[ ] "dstorage3 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 16[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 216 1[ ] "dstorage2 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 41 3[ ] ==( 4X 10.0 Gbps (FDR10) Active/ LinkUp)==> 176 1[ ] "dstorage11 mlx4_0" ( Could be 14.0625 Gbps) 41 4[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 368 1[ ] "dstorage9 mlx4_0" ( Could be 14.0625 Gbps) 41 24[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 396 1[ ] "dstorage7 mlx4_0" ( Could be 14.0625 Gbps) 41 25[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 384 1[ ] "dstorage8 mlx4_0" ( Could be 14.0625 Gbps) 41 26[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 228 1[ ] "dstorage6 mlx4_0" ( Could be 14.0625 Gbps) 353 27[ ] ==( 4X 2.5 Gbps Active/ LinkUp)==> 580 1[ ] "node025 mlx4_0" ( Could be 14.0625 Gbps) 1 8[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 748 1[ ] "node103 mlx5_0" ( Could be 14.0625 Gbps) 1 26[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 716 1[ ] "node112 mlx5_0" ( Could be 14.0625 Gbps) </pre> b267d796940c199dab55d2613f8c6b8ca07375e4 645 644 2020-09-19T09:05:39Z Mjh 2 wikitext text/x-wiki = To do next downtime = * beegfs upgrade * check IB firmware on dstorage nodes, see below == Notes == === To update using lifecycle controller === * run chromium-browser, unblock popups * log in to nodexxx.management * select 'launch virtual console' * select lifecycle controller from boot menu and reboot (console controls -- ctrl-alt-del -- apply) * wait for boot to lifecycle controller * click Ok to all options and wait for networking, ignore ipv6 warning * click 'get the latest firmware' * select 'ftp server' * say 'ftp.dell.com' NOT 'downloads.dell.com' * wait... * click 'apply' * wait... * the machine may power off, power it on again (you may need to restart virtual console) * eventually the machine will reboot to the lifecycle controller again. Now click 'exit' at top right. * close the virtual console when a Linux console prompt is showing. 
=== To update IB firmware === <pre> /soft/etc/mft-4.15.1-9-x86_64-rpm/install.sh mst start flint -d /dev/mst/mt4123_pciconf0 q cp /soft/etc/fw-ConnectX6-rel-20_26_4300-0Y1T43_07TKND_Ax-UEFI-14.19.17-FlexBoot-3.5.805.signed.bin /root flint -d /dev/mst/mt4123_pciconf0 -i fw-ConnectX6-rel-20_26_4300-0Y1T43_07TKND_Ax-UEFI-14.19.17-FlexBoot-3.5.805.signed.bin b mlxfwreset -d /dev/mst/mt4123_pciconf0 reset </pre> where device is whatever appears in /dev/mst and firmware should be downloaded from https://www.mellanox.com/support/firmware/dell == IB problem list == Returned using iblinkinfo on a new OFED install. Doesn't work with standard one. <pre> 40 3[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 252 1[ ] "dstorage5 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 7[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 392 1[ ] "metadata mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 8[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 36 1[ ] "dstorage1 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 9[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 160 1[ ] "lofar-server mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 10[ ] ==( 4X 2.5 Gbps Active/ LinkUp)==> 4 1[ ] "uhhpc mlx4_0" ( Could be 10.0 Gbps) 40 13[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 432 1[ ] "smp4 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 14[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 224 1[ ] "dstorage4 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 15[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 12 1[ ] "dstorage3 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 16[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 216 1[ ] "dstorage2 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 41 3[ ] ==( 4X 10.0 Gbps (FDR10) Active/ LinkUp)==> 176 1[ ] "dstorage11 mlx4_0" ( Could be 14.0625 Gbps) 41 4[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 368 1[ ] "dstorage9 mlx4_0" ( Could be 14.0625 Gbps) 41 24[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 396 1[ ] "dstorage7 mlx4_0" ( Could be 14.0625 Gbps) 41 25[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 384 1[ ] "dstorage8 mlx4_0" ( Could be 14.0625 Gbps) 41 26[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 228 1[ ] "dstorage6 mlx4_0" ( Could be 14.0625 Gbps) 353 27[ ] ==( 4X 2.5 Gbps Active/ LinkUp)==> 580 1[ ] "node025 mlx4_0" ( Could be 14.0625 Gbps) 1 8[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 748 1[ ] "node103 mlx5_0" ( Could be 14.0625 Gbps) 1 26[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 716 1[ ] "node112 mlx5_0" ( Could be 14.0625 Gbps) </pre> 18b65e3830cd4d71e802f9c08cc4929988a18d5c 664 645 2021-01-22T09:19:48Z Mjh 2 /* To do next downtime */ wikitext text/x-wiki = To do next downtime = * beegfs upgrade * check IB firmware on dstorage nodes, see below * drive firmware upgrade on dstorage13 * rename test queue to large and think about min job size * restart IB switch for node025 * install dstorage14 (-: == Notes == === To update using lifecycle controller === * run chromium-browser, unblock popups * log in to nodexxx.management * select 'launch virtual console' * select lifecycle controller from boot menu and reboot (console controls -- ctrl-alt-del -- apply) * wait for boot to lifecycle controller * click Ok to all options and wait for networking, ignore ipv6 warning * click 'get the latest firmware' * select 'ftp server' * say 'ftp.dell.com' NOT 'downloads.dell.com' * wait... * click 'apply' * wait... 
* the machine may power off, power it on again (you may need to restart virtual console) * eventually the machine will reboot to the lifecycle controller again. Now click 'exit' at top right. * close the virtual console when a Linux console prompt is showing. === To update IB firmware === <pre> /soft/etc/mft-4.15.1-9-x86_64-rpm/install.sh mst start flint -d /dev/mst/mt4123_pciconf0 q cp /soft/etc/fw-ConnectX6-rel-20_26_4300-0Y1T43_07TKND_Ax-UEFI-14.19.17-FlexBoot-3.5.805.signed.bin /root flint -d /dev/mst/mt4123_pciconf0 -i fw-ConnectX6-rel-20_26_4300-0Y1T43_07TKND_Ax-UEFI-14.19.17-FlexBoot-3.5.805.signed.bin b mlxfwreset -d /dev/mst/mt4123_pciconf0 reset </pre> where device is whatever appears in /dev/mst and firmware should be downloaded from https://www.mellanox.com/support/firmware/dell == IB problem list == Returned using iblinkinfo on a new OFED install. Doesn't work with standard one. <pre> 40 3[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 252 1[ ] "dstorage5 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 7[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 392 1[ ] "metadata mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 8[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 36 1[ ] "dstorage1 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 9[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 160 1[ ] "lofar-server mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 10[ ] ==( 4X 2.5 Gbps Active/ LinkUp)==> 4 1[ ] "uhhpc mlx4_0" ( Could be 10.0 Gbps) 40 13[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 432 1[ ] "smp4 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 14[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 224 1[ ] "dstorage4 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 15[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 12 1[ ] "dstorage3 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 16[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 216 1[ ] "dstorage2 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 41 3[ ] ==( 4X 10.0 Gbps (FDR10) Active/ LinkUp)==> 176 1[ ] "dstorage11 mlx4_0" ( Could be 14.0625 Gbps) 41 4[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 368 1[ ] "dstorage9 mlx4_0" ( Could be 14.0625 Gbps) 41 24[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 396 1[ ] "dstorage7 mlx4_0" ( Could be 14.0625 Gbps) 41 25[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 384 1[ ] "dstorage8 mlx4_0" ( Could be 14.0625 Gbps) 41 26[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 228 1[ ] "dstorage6 mlx4_0" ( Could be 14.0625 Gbps) 353 27[ ] ==( 4X 2.5 Gbps Active/ LinkUp)==> 580 1[ ] "node025 mlx4_0" ( Could be 14.0625 Gbps) 1 8[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 748 1[ ] "node103 mlx5_0" ( Could be 14.0625 Gbps) 1 26[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 716 1[ ] "node112 mlx5_0" ( Could be 14.0625 Gbps) </pre> d914495373f84be81eef0f5d40a5a8d3f6045de9 665 664 2021-01-27T09:36:13Z Mjh 2 /* To do next downtime */ wikitext text/x-wiki = To do next downtime = * beegfs upgrade * check IB firmware on dstorage nodes, see below * drive firmware upgrade on dstorage13 * rename test queue to large and think about min job size * restart IB switch for node025 * install dstorage14 (-: * get three chassis6 nodes working == Notes == === To update using lifecycle controller === * run chromium-browser, unblock popups * log in to nodexxx.management * select 'launch virtual console' * select lifecycle controller from boot menu and reboot (console controls -- ctrl-alt-del -- apply) 
* wait for boot to lifecycle controller * click Ok to all options and wait for networking, ignore ipv6 warning * click 'get the latest firmware' * select 'ftp server' * say 'ftp.dell.com' NOT 'downloads.dell.com' * wait... * click 'apply' * wait... * the machine may power off, power it on again (you may need to restart virtual console) * eventually the machine will reboot to the lifecycle controller again. Now click 'exit' at top right. * close the virtual console when a Linux console prompt is showing. === To update IB firmware === <pre> /soft/etc/mft-4.15.1-9-x86_64-rpm/install.sh mst start flint -d /dev/mst/mt4123_pciconf0 q cp /soft/etc/fw-ConnectX6-rel-20_26_4300-0Y1T43_07TKND_Ax-UEFI-14.19.17-FlexBoot-3.5.805.signed.bin /root flint -d /dev/mst/mt4123_pciconf0 -i fw-ConnectX6-rel-20_26_4300-0Y1T43_07TKND_Ax-UEFI-14.19.17-FlexBoot-3.5.805.signed.bin b mlxfwreset -d /dev/mst/mt4123_pciconf0 reset </pre> where device is whatever appears in /dev/mst and firmware should be downloaded from https://www.mellanox.com/support/firmware/dell == IB problem list == Returned using iblinkinfo on a new OFED install. Doesn't work with standard one. <pre> 40 3[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 252 1[ ] "dstorage5 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 7[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 392 1[ ] "metadata mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 8[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 36 1[ ] "dstorage1 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 9[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 160 1[ ] "lofar-server mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 10[ ] ==( 4X 2.5 Gbps Active/ LinkUp)==> 4 1[ ] "uhhpc mlx4_0" ( Could be 10.0 Gbps) 40 13[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 432 1[ ] "smp4 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 14[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 224 1[ ] "dstorage4 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 15[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 12 1[ ] "dstorage3 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 16[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 216 1[ ] "dstorage2 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 41 3[ ] ==( 4X 10.0 Gbps (FDR10) Active/ LinkUp)==> 176 1[ ] "dstorage11 mlx4_0" ( Could be 14.0625 Gbps) 41 4[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 368 1[ ] "dstorage9 mlx4_0" ( Could be 14.0625 Gbps) 41 24[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 396 1[ ] "dstorage7 mlx4_0" ( Could be 14.0625 Gbps) 41 25[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 384 1[ ] "dstorage8 mlx4_0" ( Could be 14.0625 Gbps) 41 26[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 228 1[ ] "dstorage6 mlx4_0" ( Could be 14.0625 Gbps) 353 27[ ] ==( 4X 2.5 Gbps Active/ LinkUp)==> 580 1[ ] "node025 mlx4_0" ( Could be 14.0625 Gbps) 1 8[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 748 1[ ] "node103 mlx5_0" ( Could be 14.0625 Gbps) 1 26[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 716 1[ ] "node112 mlx5_0" ( Could be 14.0625 Gbps) </pre> b151b78973950f822952e07fce07c68041919845 666 665 2021-02-03T17:22:04Z Mjh 2 wikitext text/x-wiki = To do next downtime = * beegfs upgrade * check IB firmware on dstorage nodes, see below * drive firmware upgrade on dstorage13 * rename test queue to large and think about min job size * restart IB switch for node025 * install dstorage14 (-: * get three chassis6 nodes working * reboot lofar and other head nodes 
for security/stability == Notes == === To update using lifecycle controller === * run chromium-browser, unblock popups * log in to nodexxx.management * select 'launch virtual console' * select lifecycle controller from boot menu and reboot (console controls -- ctrl-alt-del -- apply) * wait for boot to lifecycle controller * click Ok to all options and wait for networking, ignore ipv6 warning * click 'get the latest firmware' * select 'ftp server' * say 'ftp.dell.com' NOT 'downloads.dell.com' * wait... * click 'apply' * wait... * the machine may power off, power it on again (you may need to restart virtual console) * eventually the machine will reboot to the lifecycle controller again. Now click 'exit' at top right. * close the virtual console when a Linux console prompt is showing. === To update IB firmware === <pre> /soft/etc/mft-4.15.1-9-x86_64-rpm/install.sh mst start flint -d /dev/mst/mt4123_pciconf0 q cp /soft/etc/fw-ConnectX6-rel-20_26_4300-0Y1T43_07TKND_Ax-UEFI-14.19.17-FlexBoot-3.5.805.signed.bin /root flint -d /dev/mst/mt4123_pciconf0 -i fw-ConnectX6-rel-20_26_4300-0Y1T43_07TKND_Ax-UEFI-14.19.17-FlexBoot-3.5.805.signed.bin b mlxfwreset -d /dev/mst/mt4123_pciconf0 reset </pre> where device is whatever appears in /dev/mst and firmware should be downloaded from https://www.mellanox.com/support/firmware/dell == IB problem list == Returned using iblinkinfo on a new OFED install. Doesn't work with standard one. <pre> 40 3[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 252 1[ ] "dstorage5 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 7[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 392 1[ ] "metadata mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 8[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 36 1[ ] "dstorage1 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 9[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 160 1[ ] "lofar-server mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 10[ ] ==( 4X 2.5 Gbps Active/ LinkUp)==> 4 1[ ] "uhhpc mlx4_0" ( Could be 10.0 Gbps) 40 13[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 432 1[ ] "smp4 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 14[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 224 1[ ] "dstorage4 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 15[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 12 1[ ] "dstorage3 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 16[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 216 1[ ] "dstorage2 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 41 3[ ] ==( 4X 10.0 Gbps (FDR10) Active/ LinkUp)==> 176 1[ ] "dstorage11 mlx4_0" ( Could be 14.0625 Gbps) 41 4[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 368 1[ ] "dstorage9 mlx4_0" ( Could be 14.0625 Gbps) 41 24[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 396 1[ ] "dstorage7 mlx4_0" ( Could be 14.0625 Gbps) 41 25[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 384 1[ ] "dstorage8 mlx4_0" ( Could be 14.0625 Gbps) 41 26[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 228 1[ ] "dstorage6 mlx4_0" ( Could be 14.0625 Gbps) 353 27[ ] ==( 4X 2.5 Gbps Active/ LinkUp)==> 580 1[ ] "node025 mlx4_0" ( Could be 14.0625 Gbps) 1 8[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 748 1[ ] "node103 mlx5_0" ( Could be 14.0625 Gbps) 1 26[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 716 1[ ] "node112 mlx5_0" ( Could be 14.0625 Gbps) </pre> 2d7656d756d8075b522dca4029441692e7125591 667 666 2021-02-03T20:01:48Z Mjh 2 wikitext text/x-wiki = To do next downtime = * beegfs 
upgrade * check IB firmware on dstorage nodes, see below * drive firmware upgrade on dstorage13 * rename test queue to large and think about min job size * restart IB switch for node025 * sort infiniband on dstorage14 * get three chassis6 nodes working * reboot lofar and other head nodes for security/stability == Notes == === To update using lifecycle controller === * run chromium-browser, unblock popups * log in to nodexxx.management * select 'launch virtual console' * select lifecycle controller from boot menu and reboot (console controls -- ctrl-alt-del -- apply) * wait for boot to lifecycle controller * click Ok to all options and wait for networking, ignore ipv6 warning * click 'get the latest firmware' * select 'ftp server' * say 'ftp.dell.com' NOT 'downloads.dell.com' * wait... * click 'apply' * wait... * the machine may power off, power it on again (you may need to restart virtual console) * eventually the machine will reboot to the lifecycle controller again. Now click 'exit' at top right. * close the virtual console when a Linux console prompt is showing. === To update IB firmware === <pre> /soft/etc/mft-4.15.1-9-x86_64-rpm/install.sh mst start flint -d /dev/mst/mt4123_pciconf0 q cp /soft/etc/fw-ConnectX6-rel-20_26_4300-0Y1T43_07TKND_Ax-UEFI-14.19.17-FlexBoot-3.5.805.signed.bin /root flint -d /dev/mst/mt4123_pciconf0 -i fw-ConnectX6-rel-20_26_4300-0Y1T43_07TKND_Ax-UEFI-14.19.17-FlexBoot-3.5.805.signed.bin b mlxfwreset -d /dev/mst/mt4123_pciconf0 reset </pre> where device is whatever appears in /dev/mst and firmware should be downloaded from https://www.mellanox.com/support/firmware/dell == IB problem list == Returned using iblinkinfo on a new OFED install. Doesn't work with standard one. <pre> 40 3[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 252 1[ ] "dstorage5 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 7[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 392 1[ ] "metadata mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 8[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 36 1[ ] "dstorage1 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 9[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 160 1[ ] "lofar-server mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 10[ ] ==( 4X 2.5 Gbps Active/ LinkUp)==> 4 1[ ] "uhhpc mlx4_0" ( Could be 10.0 Gbps) 40 13[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 432 1[ ] "smp4 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 14[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 224 1[ ] "dstorage4 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 15[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 12 1[ ] "dstorage3 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 16[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 216 1[ ] "dstorage2 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 41 3[ ] ==( 4X 10.0 Gbps (FDR10) Active/ LinkUp)==> 176 1[ ] "dstorage11 mlx4_0" ( Could be 14.0625 Gbps) 41 4[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 368 1[ ] "dstorage9 mlx4_0" ( Could be 14.0625 Gbps) 41 24[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 396 1[ ] "dstorage7 mlx4_0" ( Could be 14.0625 Gbps) 41 25[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 384 1[ ] "dstorage8 mlx4_0" ( Could be 14.0625 Gbps) 41 26[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 228 1[ ] "dstorage6 mlx4_0" ( Could be 14.0625 Gbps) 353 27[ ] ==( 4X 2.5 Gbps Active/ LinkUp)==> 580 1[ ] "node025 mlx4_0" ( Could be 14.0625 Gbps) 1 8[ ] ==( 4X 10.0 Gbps Active/ 
LinkUp)==> 748 1[ ] "node103 mlx5_0" ( Could be 14.0625 Gbps) 1 26[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 716 1[ ] "node112 mlx5_0" ( Could be 14.0625 Gbps) </pre> 4cd20d1682306a2626dc8c343a2678eb407b7821 668 667 2021-02-08T20:05:03Z Mjh 2 wikitext text/x-wiki = To do next downtime = * beegfs upgrade * check IB firmware on dstorage nodes, see below * drive firmware upgrade on dstorage13 * rename test queue to large and think about min job size * restart IB switch for node025 * sort infiniband on dstorage14 * get three chassis6 nodes working * reboot lofar and other head nodes for security/stability * sort out network speed of em4 on head.data == Notes == === To update using lifecycle controller === * run chromium-browser, unblock popups * log in to nodexxx.management * select 'launch virtual console' * select lifecycle controller from boot menu and reboot (console controls -- ctrl-alt-del -- apply) * wait for boot to lifecycle controller * click Ok to all options and wait for networking, ignore ipv6 warning * click 'get the latest firmware' * select 'ftp server' * say 'ftp.dell.com' NOT 'downloads.dell.com' * wait... * click 'apply' * wait... * the machine may power off, power it on again (you may need to restart virtual console) * eventually the machine will reboot to the lifecycle controller again. Now click 'exit' at top right. * close the virtual console when a Linux console prompt is showing. === To update IB firmware === <pre> /soft/etc/mft-4.15.1-9-x86_64-rpm/install.sh mst start flint -d /dev/mst/mt4123_pciconf0 q cp /soft/etc/fw-ConnectX6-rel-20_26_4300-0Y1T43_07TKND_Ax-UEFI-14.19.17-FlexBoot-3.5.805.signed.bin /root flint -d /dev/mst/mt4123_pciconf0 -i fw-ConnectX6-rel-20_26_4300-0Y1T43_07TKND_Ax-UEFI-14.19.17-FlexBoot-3.5.805.signed.bin b mlxfwreset -d /dev/mst/mt4123_pciconf0 reset </pre> where device is whatever appears in /dev/mst and firmware should be downloaded from https://www.mellanox.com/support/firmware/dell == IB problem list == Returned using iblinkinfo on a new OFED install. Doesn't work with standard one. 
<pre> 40 3[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 252 1[ ] "dstorage5 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 7[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 392 1[ ] "metadata mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 8[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 36 1[ ] "dstorage1 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 9[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 160 1[ ] "lofar-server mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 10[ ] ==( 4X 2.5 Gbps Active/ LinkUp)==> 4 1[ ] "uhhpc mlx4_0" ( Could be 10.0 Gbps) 40 13[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 432 1[ ] "smp4 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 14[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 224 1[ ] "dstorage4 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 15[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 12 1[ ] "dstorage3 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 16[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 216 1[ ] "dstorage2 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 41 3[ ] ==( 4X 10.0 Gbps (FDR10) Active/ LinkUp)==> 176 1[ ] "dstorage11 mlx4_0" ( Could be 14.0625 Gbps) 41 4[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 368 1[ ] "dstorage9 mlx4_0" ( Could be 14.0625 Gbps) 41 24[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 396 1[ ] "dstorage7 mlx4_0" ( Could be 14.0625 Gbps) 41 25[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 384 1[ ] "dstorage8 mlx4_0" ( Could be 14.0625 Gbps) 41 26[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 228 1[ ] "dstorage6 mlx4_0" ( Could be 14.0625 Gbps) 353 27[ ] ==( 4X 2.5 Gbps Active/ LinkUp)==> 580 1[ ] "node025 mlx4_0" ( Could be 14.0625 Gbps) 1 8[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 748 1[ ] "node103 mlx5_0" ( Could be 14.0625 Gbps) 1 26[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 716 1[ ] "node112 mlx5_0" ( Could be 14.0625 Gbps) </pre> 7db4cfaa0ad97bc9ca1ea3295742c10590dd176f 669 668 2021-02-12T12:08:39Z Mjh 2 wikitext text/x-wiki = To do next downtime = * beegfs upgrade * check IB firmware on dstorage nodes, see below * drive firmware upgrade on dstorage13 [DONE] * rename test queue to large and think about min job size * restart IB switch for node025 * sort infiniband on dstorage14 * get three chassis6 nodes working * reboot lofar and other head nodes for security/stability * sort out network speed of em4 on head.data == Notes == === To update using lifecycle controller === * run chromium-browser, unblock popups * log in to nodexxx.management * select 'launch virtual console' * select lifecycle controller from boot menu and reboot (console controls -- ctrl-alt-del -- apply) * wait for boot to lifecycle controller * click Ok to all options and wait for networking, ignore ipv6 warning * click 'get the latest firmware' * select 'ftp server' * say 'ftp.dell.com' NOT 'downloads.dell.com' * wait... * click 'apply' * wait... * the machine may power off, power it on again (you may need to restart virtual console) * eventually the machine will reboot to the lifecycle controller again. Now click 'exit' at top right. * close the virtual console when a Linux console prompt is showing. 
=== To update IB firmware === <pre> /soft/etc/mft-4.15.1-9-x86_64-rpm/install.sh mst start flint -d /dev/mst/mt4123_pciconf0 q cp /soft/etc/fw-ConnectX6-rel-20_26_4300-0Y1T43_07TKND_Ax-UEFI-14.19.17-FlexBoot-3.5.805.signed.bin /root flint -d /dev/mst/mt4123_pciconf0 -i fw-ConnectX6-rel-20_26_4300-0Y1T43_07TKND_Ax-UEFI-14.19.17-FlexBoot-3.5.805.signed.bin b mlxfwreset -d /dev/mst/mt4123_pciconf0 reset </pre> where device is whatever appears in /dev/mst and firmware should be downloaded from https://www.mellanox.com/support/firmware/dell == IB problem list == Returned using iblinkinfo on a new OFED install. Doesn't work with standard one. <pre> 40 3[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 252 1[ ] "dstorage5 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 7[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 392 1[ ] "metadata mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 8[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 36 1[ ] "dstorage1 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 9[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 160 1[ ] "lofar-server mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 10[ ] ==( 4X 2.5 Gbps Active/ LinkUp)==> 4 1[ ] "uhhpc mlx4_0" ( Could be 10.0 Gbps) 40 13[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 432 1[ ] "smp4 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 14[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 224 1[ ] "dstorage4 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 15[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 12 1[ ] "dstorage3 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 16[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 216 1[ ] "dstorage2 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 41 3[ ] ==( 4X 10.0 Gbps (FDR10) Active/ LinkUp)==> 176 1[ ] "dstorage11 mlx4_0" ( Could be 14.0625 Gbps) 41 4[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 368 1[ ] "dstorage9 mlx4_0" ( Could be 14.0625 Gbps) 41 24[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 396 1[ ] "dstorage7 mlx4_0" ( Could be 14.0625 Gbps) 41 25[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 384 1[ ] "dstorage8 mlx4_0" ( Could be 14.0625 Gbps) 41 26[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 228 1[ ] "dstorage6 mlx4_0" ( Could be 14.0625 Gbps) 353 27[ ] ==( 4X 2.5 Gbps Active/ LinkUp)==> 580 1[ ] "node025 mlx4_0" ( Could be 14.0625 Gbps) 1 8[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 748 1[ ] "node103 mlx5_0" ( Could be 14.0625 Gbps) 1 26[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 716 1[ ] "node112 mlx5_0" ( Could be 14.0625 Gbps) </pre> d30d94e61c37b799947d2865bc6382491828899d 670 669 2021-02-12T13:06:45Z Mjh 2 wikitext text/x-wiki = To do next downtime = * beegfs upgrade * check IB firmware on dstorage nodes, see below * drive firmware upgrade on dstorage13 [DONE] * rename test queue to large and think about min job size * restart IB switch for node025 * sort infiniband on dstorage14 [DONE] * get three chassis6 nodes working * reboot lofar and other head nodes for security/stability * sort out network speed of em4 on head.data [DONE] == Notes == === To update using lifecycle controller === * run chromium-browser, unblock popups * log in to nodexxx.management * select 'launch virtual console' * select lifecycle controller from boot menu and reboot (console controls -- ctrl-alt-del -- apply) * wait for boot to lifecycle controller * click Ok to all options and wait for networking, ignore ipv6 warning * click 'get the latest firmware' * select 
'ftp server' * say 'ftp.dell.com' NOT 'downloads.dell.com' * wait... * click 'apply' * wait... * the machine may power off, power it on again (you may need to restart virtual console) * eventually the machine will reboot to the lifecycle controller again. Now click 'exit' at top right. * close the virtual console when a Linux console prompt is showing. === To update IB firmware === <pre> /soft/etc/mft-4.15.1-9-x86_64-rpm/install.sh mst start flint -d /dev/mst/mt4123_pciconf0 q cp /soft/etc/fw-ConnectX6-rel-20_26_4300-0Y1T43_07TKND_Ax-UEFI-14.19.17-FlexBoot-3.5.805.signed.bin /root flint -d /dev/mst/mt4123_pciconf0 -i fw-ConnectX6-rel-20_26_4300-0Y1T43_07TKND_Ax-UEFI-14.19.17-FlexBoot-3.5.805.signed.bin b mlxfwreset -d /dev/mst/mt4123_pciconf0 reset </pre> where device is whatever appears in /dev/mst and firmware should be downloaded from https://www.mellanox.com/support/firmware/dell == IB problem list == Returned using iblinkinfo on a new OFED install. Doesn't work with standard one. <pre> 40 3[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 252 1[ ] "dstorage5 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 7[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 392 1[ ] "metadata mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 8[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 36 1[ ] "dstorage1 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 9[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 160 1[ ] "lofar-server mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 10[ ] ==( 4X 2.5 Gbps Active/ LinkUp)==> 4 1[ ] "uhhpc mlx4_0" ( Could be 10.0 Gbps) 40 13[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 432 1[ ] "smp4 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 14[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 224 1[ ] "dstorage4 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 15[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 12 1[ ] "dstorage3 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 16[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 216 1[ ] "dstorage2 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 41 3[ ] ==( 4X 10.0 Gbps (FDR10) Active/ LinkUp)==> 176 1[ ] "dstorage11 mlx4_0" ( Could be 14.0625 Gbps) 41 4[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 368 1[ ] "dstorage9 mlx4_0" ( Could be 14.0625 Gbps) 41 24[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 396 1[ ] "dstorage7 mlx4_0" ( Could be 14.0625 Gbps) 41 25[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 384 1[ ] "dstorage8 mlx4_0" ( Could be 14.0625 Gbps) 41 26[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 228 1[ ] "dstorage6 mlx4_0" ( Could be 14.0625 Gbps) 353 27[ ] ==( 4X 2.5 Gbps Active/ LinkUp)==> 580 1[ ] "node025 mlx4_0" ( Could be 14.0625 Gbps) 1 8[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 748 1[ ] "node103 mlx5_0" ( Could be 14.0625 Gbps) 1 26[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 716 1[ ] "node112 mlx5_0" ( Could be 14.0625 Gbps) </pre> d018f7b0433867031c9080faa6749d6edf8a7dd9 671 670 2021-02-12T14:30:50Z Mjh 2 wikitext text/x-wiki = To do next downtime = * beegfs upgrade * check IB firmware on dstorage nodes, see below * drive firmware upgrade on dstorage13 [DONE] * rename test queue to large and think about min job size [DONE] * restart IB switch for node025 [DONE] * sort infiniband on dstorage14 [DONE] * get three chassis6 nodes working * reboot lofar and other head nodes for security/stability [DONE] * sort out network speed of em4 on head.data [DONE] == Notes == === To update using 
lifecycle controller === * run chromium-browser, unblock popups * log in to nodexxx.management * select 'launch virtual console' * select lifecycle controller from boot menu and reboot (console controls -- ctrl-alt-del -- apply) * wait for boot to lifecycle controller * click Ok to all options and wait for networking, ignore ipv6 warning * click 'get the latest firmware' * select 'ftp server' * say 'ftp.dell.com' NOT 'downloads.dell.com' * wait... * click 'apply' * wait... * the machine may power off, power it on again (you may need to restart virtual console) * eventually the machine will reboot to the lifecycle controller again. Now click 'exit' at top right. * close the virtual console when a Linux console prompt is showing. === To update IB firmware === <pre> /soft/etc/mft-4.15.1-9-x86_64-rpm/install.sh mst start flint -d /dev/mst/mt4123_pciconf0 q cp /soft/etc/fw-ConnectX6-rel-20_26_4300-0Y1T43_07TKND_Ax-UEFI-14.19.17-FlexBoot-3.5.805.signed.bin /root flint -d /dev/mst/mt4123_pciconf0 -i fw-ConnectX6-rel-20_26_4300-0Y1T43_07TKND_Ax-UEFI-14.19.17-FlexBoot-3.5.805.signed.bin b mlxfwreset -d /dev/mst/mt4123_pciconf0 reset </pre> where device is whatever appears in /dev/mst and firmware should be downloaded from https://www.mellanox.com/support/firmware/dell == IB problem list == Returned using iblinkinfo on a new OFED install. Doesn't work with standard one. <pre> 40 3[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 252 1[ ] "dstorage5 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 7[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 392 1[ ] "metadata mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 8[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 36 1[ ] "dstorage1 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 9[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 160 1[ ] "lofar-server mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 10[ ] ==( 4X 2.5 Gbps Active/ LinkUp)==> 4 1[ ] "uhhpc mlx4_0" ( Could be 10.0 Gbps) 40 13[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 432 1[ ] "smp4 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 14[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 224 1[ ] "dstorage4 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 15[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 12 1[ ] "dstorage3 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 16[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 216 1[ ] "dstorage2 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 41 3[ ] ==( 4X 10.0 Gbps (FDR10) Active/ LinkUp)==> 176 1[ ] "dstorage11 mlx4_0" ( Could be 14.0625 Gbps) 41 4[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 368 1[ ] "dstorage9 mlx4_0" ( Could be 14.0625 Gbps) 41 24[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 396 1[ ] "dstorage7 mlx4_0" ( Could be 14.0625 Gbps) 41 25[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 384 1[ ] "dstorage8 mlx4_0" ( Could be 14.0625 Gbps) 41 26[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 228 1[ ] "dstorage6 mlx4_0" ( Could be 14.0625 Gbps) 353 27[ ] ==( 4X 2.5 Gbps Active/ LinkUp)==> 580 1[ ] "node025 mlx4_0" ( Could be 14.0625 Gbps) 1 8[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 748 1[ ] "node103 mlx5_0" ( Could be 14.0625 Gbps) 1 26[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 716 1[ ] "node112 mlx5_0" ( Could be 14.0625 Gbps) </pre> a49bf406b7cde926c7f63ceccbe492ef8af5a0da 673 671 2021-02-14T14:26:04Z Mjh 2 wikitext text/x-wiki = To do next downtime = * beegfs upgrade * check IB firmware on dstorage nodes, see 
below * get three chassis6 nodes working == Notes == === To update using lifecycle controller === * run chromium-browser, unblock popups * log in to nodexxx.management * select 'launch virtual console' * select lifecycle controller from boot menu and reboot (console controls -- ctrl-alt-del -- apply) * wait for boot to lifecycle controller * click Ok to all options and wait for networking, ignore ipv6 warning * click 'get the latest firmware' * select 'ftp server' * say 'ftp.dell.com' NOT 'downloads.dell.com' * wait... * click 'apply' * wait... * the machine may power off, power it on again (you may need to restart virtual console) * eventually the machine will reboot to the lifecycle controller again. Now click 'exit' at top right. * close the virtual console when a Linux console prompt is showing. === To update IB firmware === <pre> /soft/etc/mft-4.15.1-9-x86_64-rpm/install.sh mst start flint -d /dev/mst/mt4123_pciconf0 q cp /soft/etc/fw-ConnectX6-rel-20_26_4300-0Y1T43_07TKND_Ax-UEFI-14.19.17-FlexBoot-3.5.805.signed.bin /root flint -d /dev/mst/mt4123_pciconf0 -i fw-ConnectX6-rel-20_26_4300-0Y1T43_07TKND_Ax-UEFI-14.19.17-FlexBoot-3.5.805.signed.bin b mlxfwreset -d /dev/mst/mt4123_pciconf0 reset </pre> where device is whatever appears in /dev/mst and firmware should be downloaded from https://www.mellanox.com/support/firmware/dell == IB problem list == Returned using iblinkinfo on a new OFED install. Doesn't work with standard one. <pre> 40 3[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 252 1[ ] "dstorage5 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 7[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 392 1[ ] "metadata mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 8[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 36 1[ ] "dstorage1 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 9[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 160 1[ ] "lofar-server mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 10[ ] ==( 4X 2.5 Gbps Active/ LinkUp)==> 4 1[ ] "uhhpc mlx4_0" ( Could be 10.0 Gbps) 40 13[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 432 1[ ] "smp4 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 14[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 224 1[ ] "dstorage4 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 15[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 12 1[ ] "dstorage3 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 40 16[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 216 1[ ] "dstorage2 mlx4_0" ( Could be FDR10 (Found link at QDR but expected speed is FDR10)) 41 3[ ] ==( 4X 10.0 Gbps (FDR10) Active/ LinkUp)==> 176 1[ ] "dstorage11 mlx4_0" ( Could be 14.0625 Gbps) 41 4[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 368 1[ ] "dstorage9 mlx4_0" ( Could be 14.0625 Gbps) 41 24[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 396 1[ ] "dstorage7 mlx4_0" ( Could be 14.0625 Gbps) 41 25[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 384 1[ ] "dstorage8 mlx4_0" ( Could be 14.0625 Gbps) 41 26[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 228 1[ ] "dstorage6 mlx4_0" ( Could be 14.0625 Gbps) 353 27[ ] ==( 4X 2.5 Gbps Active/ LinkUp)==> 580 1[ ] "node025 mlx4_0" ( Could be 14.0625 Gbps) 1 8[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 748 1[ ] "node103 mlx5_0" ( Could be 14.0625 Gbps) 1 26[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 716 1[ ] "node112 mlx5_0" ( Could be 14.0625 Gbps) </pre> ef740562099dd1eadb9e3ef7d853e532841590de MPI 0 12 646 595 2020-09-30T14:47:57Z Mjh 2 wikitext text/x-wiki == 
What is MPI? == MPI stands for ''Message Passing Interface''. It provides a standard for communication between processes either running on the same host or over a network. Programs written using this standard can communicate with each other without needing to know details of the way in which they are connected. For a more detailed description see the [http://en.wikipedia.org/wiki/Message_Passing_Interface wikipedia page]. The MPI standard (with extensions in some cases) has been implemented by many different groups. MPI tutorials are widely available on the web. == MPI on the cluster == There are several implementations of MPI on the cluster (see full list below). All of them provide compilation tools (mpicc, mpif77), libraries, and a means of running code. All MPI libraries on the cluster know about, and will use, your current allocation of nodes when you have run a [[jobs|job]. You should read the page on [[jobs]] before trying to do anything with any of the example scripts given here! All MPI implementations are provided using the [[modules|module]] system and so largely the way to use them is interchangeable between different implementations. At the time of writing (Sep 2020) we recommend using OpenMPI-4.0.5 as it appears to provide the fastest use of the Infiniband network that link the nodes. Some MPI implementations do not use Infiniband at all and these should be avoided if possible. == Compiling == Make sure the correct module is loaded: <pre> module load openmpi-4.0.5 </pre> Now compile your MPI code as normal (e.g. <tt>mpicc</tt>, <tt>mpif77</tt>. == Running in a job == Your job control system script should call the correct version <tt>mpiexec</tt>: <pre> #!/bin/sh -f #PBS -N mpi-demo #PBS -m abe #PBS -l nodes=2:ppn=32 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ eval `/usr/bin/modulecmd bash load openmpi-4.0.5` mpiexec /home/myusername/mympijob echo ------------------------------------------------------ echo Job ends </pre> <tt>mpiexec</tt> uses the Torque environment variables to start processes up on the appropriate nodes. Note that in this example the '''only''' key lines are (1) the selection of the right MPI library and (2) the one calling <tt>mpiexec</tt>. The others are Torque directives or are there for debugging/demonstration purposes. By default, <tt>mpiexec</tt> will run as many processes as there are processors available to it: so if you have nodes=2:ppn=32 set, as in the example above, there will be 2*32 MPI processes (32 per node). ae672c48e41450044793b6c988e62be1ee781702 647 646 2020-09-30T15:06:39Z Mjh 2 wikitext text/x-wiki == What is MPI? == MPI stands for ''Message Passing Interface''. It provides a standard for communication between processes either running on the same host or over a network. Programs written using this standard can communicate with each other without needing to know details of the way in which they are connected. 
For a more detailed description see the [http://en.wikipedia.org/wiki/Message_Passing_Interface wikipedia page]. The MPI standard (with extensions in some cases) has been implemented by many different groups. MPI tutorials are widely available on the web. == MPI on the cluster == There are several implementations of MPI on the cluster (see full list below). All of them provide compilation tools (mpicc, mpif77), libraries, and a means of running code. All MPI libraries on the cluster know about, and will use, your current allocation of nodes when you have run a [[jobs|job]. You should read the page on [[jobs]] before trying to do anything with any of the example scripts given here! All MPI implementations are provided using the [[modules|module]] system and so largely the way to use them is interchangeable between different implementations. At the time of writing (Sep 2020) we recommend using OpenMPI-4.0.5 as it appears to provide the fastest use of the Infiniband network that link the nodes. Some MPI implementations do not use Infiniband at all and these should be avoided if possible. == Compiling == Make sure the correct module is loaded: <pre> module load openmpi-4.0.5 </pre> Now compile your MPI code as normal (e.g. <tt>mpicc</tt>, <tt>mpif77</tt>. == Running in a job == Your job control system script should call the correct version <tt>mpiexec</tt>: <pre> #!/bin/sh -f #PBS -N mpi-demo #PBS -m abe #PBS -l nodes=2:ppn=32 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ eval `/usr/bin/modulecmd bash load openmpi-4.0.5` mpiexec /home/myusername/mympijob echo ------------------------------------------------------ echo Job ends </pre> <tt>mpiexec</tt> uses the Torque environment variables to start processes up on the appropriate nodes. Note that in this example the '''only''' key lines are (1) the selection of the right MPI library and (2) the one calling <tt>mpiexec</tt>. The others are Torque directives or are there for debugging/demonstration purposes. By default, <tt>mpiexec</tt> will run as many processes as there are processors available to it: so if you have nodes=2:ppn=32 set, as in the example above, there will be 2*32 MPI processes (32 per node). == List of MPI implementations == The available implementations are listed in the table below. {| !Module name !Name !MPI version !Infiniband? |- |openmpi-4.0.5||OpenMPI 4.0.5||4||Y |- |mvapich2-2.3||MVAPICH2 2.3||2||Y |- |intel-mpi||Intel MPI Library||3||Y |- |mpich2-local||MPICH2||2||N |- |mpi/mpich-3.0-x86_64||MPICH 3.0||3||N |- |mpi/mpich-3.2-x86_64||MPICH 3.2||3||N |- |} In general you should select an implementation which implements the latest version of MPI, unless your code is too old to use that, and you should always select an Infiniband-aware implementation if you plan to run on multiple nodes. 5c9ccabaf23af630c8c59246ad2d94f1880e0864 648 647 2020-09-30T15:07:04Z Mjh 2 wikitext text/x-wiki == What is MPI? 
== MPI stands for ''Message Passing Interface''. It provides a standard for communication between processes either running on the same host or over a network. Programs written using this standard can communicate with each other without needing to know details of the way in which they are connected. For a more detailed description see the [http://en.wikipedia.org/wiki/Message_Passing_Interface wikipedia page]. The MPI standard (with extensions in some cases) has been implemented by many different groups. MPI tutorials are widely available on the web. == MPI on the cluster == There are several implementations of MPI on the cluster (see full list below). All of them provide compilation tools (mpicc, mpif77), libraries, and a means of running code. All MPI libraries on the cluster know about, and will use, your current allocation of nodes when you have run a [[jobs|job]. You should read the page on [[jobs]] before trying to do anything with any of the example scripts given here! All MPI implementations are provided using the [[modules|module]] system and so largely the way to use them is interchangeable between different implementations. At the time of writing (Sep 2020) we recommend using OpenMPI-4.0.5 as it appears to provide the fastest use of the Infiniband network that link the nodes. Some MPI implementations do not use Infiniband at all and these should be avoided if possible. == Compiling == Make sure the correct module is loaded: <pre> module load openmpi-4.0.5 </pre> Now compile your MPI code as normal (e.g. <tt>mpicc</tt>, <tt>mpif77</tt>). == Running in a job == Your job control system script should call the correct version <tt>mpiexec</tt>: <pre> #!/bin/sh -f #PBS -N mpi-demo #PBS -m abe #PBS -l nodes=2:ppn=32 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ eval `/usr/bin/modulecmd bash load openmpi-4.0.5` mpiexec /home/myusername/mympijob echo ------------------------------------------------------ echo Job ends </pre> <tt>mpiexec</tt> uses the Torque environment variables to start processes up on the appropriate nodes. Note that in this example the '''only''' key lines are (1) the selection of the right MPI library and (2) the one calling <tt>mpiexec</tt>. The others are Torque directives or are there for debugging/demonstration purposes. By default, <tt>mpiexec</tt> will run as many processes as there are processors available to it: so if you have nodes=2:ppn=32 set, as in the example above, there will be 2*32 MPI processes (32 per node). == List of MPI implementations == The available implementations are listed in the table below. {| !Module name !Name !MPI version !Infiniband? 
|- |openmpi-4.0.5||OpenMPI 4.0.5||4||Y |- |mvapich2-2.3||MVAPICH2 2.3||2||Y |- |intel-mpi||Intel MPI Library||3||Y |- |mpich2-local||MPICH2||2||N |- |mpi/mpich-3.0-x86_64||MPICH 3.0||3||N |- |mpi/mpich-3.2-x86_64||MPICH 3.2||3||N |- |} In general you should select an implementation which implements the latest version of MPI, unless your code is too old to use that, and you should always select an Infiniband-aware implementation if you plan to run on multiple nodes. 608846c886fa326a4acd450fa3ebab74ee65ef61 649 648 2020-09-30T15:11:33Z Mjh 2 /* List of MPI implementations */ wikitext text/x-wiki == What is MPI? == MPI stands for ''Message Passing Interface''. It provides a standard for communication between processes either running on the same host or over a network. Programs written using this standard can communicate with each other without needing to know details of the way in which they are connected. For a more detailed description see the [http://en.wikipedia.org/wiki/Message_Passing_Interface wikipedia page]. The MPI standard (with extensions in some cases) has been implemented by many different groups. MPI tutorials are widely available on the web. == MPI on the cluster == There are several implementations of MPI on the cluster (see full list below). All of them provide compilation tools (mpicc, mpif77), libraries, and a means of running code. All MPI libraries on the cluster know about, and will use, your current allocation of nodes when you have run a [[jobs|job]. You should read the page on [[jobs]] before trying to do anything with any of the example scripts given here! All MPI implementations are provided using the [[modules|module]] system and so largely the way to use them is interchangeable between different implementations. At the time of writing (Sep 2020) we recommend using OpenMPI-4.0.5 as it appears to provide the fastest use of the Infiniband network that link the nodes. Some MPI implementations do not use Infiniband at all and these should be avoided if possible. == Compiling == Make sure the correct module is loaded: <pre> module load openmpi-4.0.5 </pre> Now compile your MPI code as normal (e.g. <tt>mpicc</tt>, <tt>mpif77</tt>). == Running in a job == Your job control system script should call the correct version <tt>mpiexec</tt>: <pre> #!/bin/sh -f #PBS -N mpi-demo #PBS -m abe #PBS -l nodes=2:ppn=32 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ eval `/usr/bin/modulecmd bash load openmpi-4.0.5` mpiexec /home/myusername/mympijob echo ------------------------------------------------------ echo Job ends </pre> <tt>mpiexec</tt> uses the Torque environment variables to start processes up on the appropriate nodes. Note that in this example the '''only''' key lines are (1) the selection of the right MPI library and (2) the one calling <tt>mpiexec</tt>. The others are Torque directives or are there for debugging/demonstration purposes. 
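<tt>mpiexec</tt> also accepts the standard <tt>-n</tt> option if you want fewer processes than your allocation would give you; a minimal sketch (the executable path is hypothetical, as in the example above):
<pre>
# run exactly 8 MPI processes rather than one per allocated processor
mpiexec -n 8 /home/myusername/mympijob
</pre>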
By default, <tt>mpiexec</tt> will run as many processes as there are processors available to it: so if you have nodes=2:ppn=32 set, as in the example above, there will be 2*32 MPI processes (32 per node). == List of MPI implementations == The available implementations are listed in the table below. {| !Module name !Name !MPI version !Infiniband? |- |openmpi-4.0.5||OpenMPI 4.0.5||3||Y |- |mvapich2-2.3||MVAPICH2 2.3||2||Y |- |intel-mpi||Intel MPI Library||3||Y |- |mpich2-local||MPICH2||2||N |- |mpi/mpich-3.0-x86_64||MPICH 3.0||2||N | |mpi/mpich-3.2-x86_64||MPICH 3.2||2||N |- |} In general you should select an implementation which implements the latest version of MPI, unless your code is too old to use that, and you should always select an Infiniband-aware implementation if you plan to run on multiple nodes. 3270db6a013f3dca5b7f3bb72268f9f3c7158199 650 649 2020-09-30T15:11:54Z Mjh 2 /* List of MPI implementations */ wikitext text/x-wiki == What is MPI? == MPI stands for ''Message Passing Interface''. It provides a standard for communication between processes either running on the same host or over a network. Programs written using this standard can communicate with each other without needing to know details of the way in which they are connected. For a more detailed description see the [http://en.wikipedia.org/wiki/Message_Passing_Interface wikipedia page]. The MPI standard (with extensions in some cases) has been implemented by many different groups. MPI tutorials are widely available on the web. == MPI on the cluster == There are several implementations of MPI on the cluster (see full list below). All of them provide compilation tools (mpicc, mpif77), libraries, and a means of running code. All MPI libraries on the cluster know about, and will use, your current allocation of nodes when you have run a [[jobs|job]. You should read the page on [[jobs]] before trying to do anything with any of the example scripts given here! All MPI implementations are provided using the [[modules|module]] system and so largely the way to use them is interchangeable between different implementations. At the time of writing (Sep 2020) we recommend using OpenMPI-4.0.5 as it appears to provide the fastest use of the Infiniband network that link the nodes. Some MPI implementations do not use Infiniband at all and these should be avoided if possible. == Compiling == Make sure the correct module is loaded: <pre> module load openmpi-4.0.5 </pre> Now compile your MPI code as normal (e.g. <tt>mpicc</tt>, <tt>mpif77</tt>). 
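For example, a minimal compile of a C and a Fortran 77 source with the wrapper compilers supplied by the module (the file names here are hypothetical):
<pre>
module load openmpi-4.0.5
mpicc  -O2 -o mympijob mympijob.c    # C source
mpif77 -O2 -o myf77job myf77job.f    # Fortran 77 source
</pre>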
== Running in a job == Your job control system script should call the correct version <tt>mpiexec</tt>: <pre> #!/bin/sh -f #PBS -N mpi-demo #PBS -m abe #PBS -l nodes=2:ppn=32 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ eval `/usr/bin/modulecmd bash load openmpi-4.0.5` mpiexec /home/myusername/mympijob echo ------------------------------------------------------ echo Job ends </pre> <tt>mpiexec</tt> uses the Torque environment variables to start processes up on the appropriate nodes. Note that in this example the '''only''' key lines are (1) the selection of the right MPI library and (2) the one calling <tt>mpiexec</tt>. The others are Torque directives or are there for debugging/demonstration purposes. By default, <tt>mpiexec</tt> will run as many processes as there are processors available to it: so if you have nodes=2:ppn=32 set, as in the example above, there will be 2*32 MPI processes (32 per node). == List of MPI implementations == The available implementations are listed in the table below. {| !Module name !Name !MPI version !Infiniband? |- |openmpi-4.0.5||OpenMPI 4.0.5||3||Y |- |mvapich2-2.3||MVAPICH2 2.3||2||Y |- |intel-mpi||Intel MPI Library||3||Y |- |mpich2-local||MPICH2||2||N |- |mpi/mpich-3.0-x86_64||MPICH 3.0||2||N |- |mpi/mpich-3.2-x86_64||MPICH 3.2||2||N |- |} In general you should select an implementation which implements the latest version of MPI, unless your code is too old to use that, and you should always select an Infiniband-aware implementation if you plan to run on multiple nodes. 2c1ef2177b7b49e62b1fa138133a70e67f267ddd 651 650 2020-09-30T15:13:42Z Mjh 2 wikitext text/x-wiki == What is MPI? == MPI stands for ''Message Passing Interface''. It provides a standard for communication between processes either running on the same host or over a network. Programs written using this standard can communicate with each other without needing to know details of the way in which they are connected. For a more detailed description see the [http://en.wikipedia.org/wiki/Message_Passing_Interface wikipedia page]. The MPI standard (with extensions in some cases) has been implemented by many different groups. MPI tutorials are widely available on the web. == MPI on the cluster == There are several implementations of MPI on the cluster (see full list below). All of them provide compilation tools (mpicc, mpif77), libraries, and a means of running code. All MPI libraries on the cluster know about, and will use, your current allocation of nodes when you have run a [[jobs|job]]. You should read the page on [[jobs]] before trying to do anything with any of the example scripts given here! All MPI implementations are provided using the [[modules|module]] system and so largely the way to use them is interchangeable between different implementations. 
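To see which MPI modules are installed you can ask the module system directly; a quick sketch (the exact list will vary):
<pre>
module avail 2>&1 | grep -i mpi    # list MPI-related modules (module prints to stderr)
module list                        # show which modules are currently loaded
</pre>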
At the time of writing (Sep 2020) we recommend using OpenMPI-4.0.5 as it appears to provide the fastest use of the Infiniband network that link the nodes. Some MPI implementations do not use Infiniband at all and these should be avoided if possible. == Compiling == Make sure the correct module is loaded: <pre> module load openmpi-4.0.5 </pre> Now compile your MPI code as normal (e.g. <tt>mpicc</tt>, <tt>mpif77</tt>). == Running in a job == Your job control system script should call the correct version <tt>mpiexec</tt>: <pre> #!/bin/sh -f #PBS -N mpi-demo #PBS -m abe #PBS -l nodes=2:ppn=32 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ eval `/usr/bin/modulecmd bash load openmpi-4.0.5` mpiexec /home/myusername/mympijob echo ------------------------------------------------------ echo Job ends </pre> <tt>mpiexec</tt> uses the Torque environment variables to start processes up on the appropriate nodes. Note that in this example the '''only''' key lines are (1) the selection of the right MPI library and (2) the one calling <tt>mpiexec</tt>. The others are Torque directives or are there for debugging/demonstration purposes. By default, <tt>mpiexec</tt> will run as many processes as there are processors available to it: so if you have nodes=2:ppn=32 set, as in the example above, there will be 2*32 MPI processes (32 per node). == List of MPI implementations == The available implementations are listed in the table below. {| !Module name !Name !MPI version !Infiniband? |- |openmpi-4.0.5||OpenMPI 4.0.5||3||Y |- |mvapich2-2.3||MVAPICH2 2.3||2||Y |- |intel-mpi||Intel MPI Library||3||Y |- |mpich2-local||MPICH2||2||N |- |mpi/mpich-3.0-x86_64||MPICH 3.0||2||N |- |mpi/mpich-3.2-x86_64||MPICH 3.2||2||N |- |} In general you should select an implementation which implements the latest version of MPI, unless your code is too old to use that, and you should always select an Infiniband-aware implementation if you plan to run on multiple nodes. dec4b1ee65dbcc41b139803afe3b2d323590315e 652 651 2020-09-30T15:14:33Z Mjh 2 /* Running in a job */ wikitext text/x-wiki == What is MPI? == MPI stands for ''Message Passing Interface''. It provides a standard for communication between processes either running on the same host or over a network. Programs written using this standard can communicate with each other without needing to know details of the way in which they are connected. For a more detailed description see the [http://en.wikipedia.org/wiki/Message_Passing_Interface wikipedia page]. The MPI standard (with extensions in some cases) has been implemented by many different groups. MPI tutorials are widely available on the web. == MPI on the cluster == There are several implementations of MPI on the cluster (see full list below). All of them provide compilation tools (mpicc, mpif77), libraries, and a means of running code. 
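The wrapper compilers are thin wrappers around the ordinary compilers; if you want to see exactly which flags and libraries they add, most implementations can show you (for example <tt>mpicc --showme</tt> in OpenMPI or <tt>mpicc -show</tt> in MPICH):
<pre>
# print the underlying compiler command line without running it
mpicc --showme
</pre>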
All MPI libraries on the cluster know about, and will use, your current allocation of nodes when you have run a [[jobs|job]]. You should read the page on [[jobs]] before trying to do anything with any of the example scripts given here! All MPI implementations are provided using the [[modules|module]] system and so largely the way to use them is interchangeable between different implementations. At the time of writing (Sep 2020) we recommend using OpenMPI-4.0.5 as it appears to provide the fastest use of the Infiniband network that link the nodes. Some MPI implementations do not use Infiniband at all and these should be avoided if possible. == Compiling == Make sure the correct module is loaded: <pre> module load openmpi-4.0.5 </pre> Now compile your MPI code as normal (e.g. <tt>mpicc</tt>, <tt>mpif77</tt>). == Running in a job == Your job control system script should call the correct version <tt>mpiexec</tt>: <pre> #!/bin/sh -f #PBS -N mpi-demo #PBS -m abe #PBS -l nodes=2:ppn=32 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ eval `/usr/bin/modulecmd bash load openmpi-4.0.5` mpiexec /home/myusername/mympijob echo ------------------------------------------------------ echo Job ends </pre> <tt>mpiexec</tt> uses the Torque environment variables to start processes up on the appropriate nodes. So you will see that you don't have to specify anything else other than the executable to run on the mpiexec line -- mpiexec will do the right thing. Note that in this example the '''only''' key lines are (1) the selection of the right MPI library and (2) the one calling <tt>mpiexec</tt>. The others are Torque directives or are there for debugging/demonstration purposes. By default, <tt>mpiexec</tt> will run as many processes as there are processors available to it: so if you have nodes=2:ppn=32 set, as in the example above, there will be 2*32 MPI processes (32 per node). == List of MPI implementations == The available implementations are listed in the table below. {| !Module name !Name !MPI version !Infiniband? |- |openmpi-4.0.5||OpenMPI 4.0.5||3||Y |- |mvapich2-2.3||MVAPICH2 2.3||2||Y |- |intel-mpi||Intel MPI Library||3||Y |- |mpich2-local||MPICH2||2||N |- |mpi/mpich-3.0-x86_64||MPICH 3.0||2||N |- |mpi/mpich-3.2-x86_64||MPICH 3.2||2||N |- |} In general you should select an implementation which implements the latest version of MPI, unless your code is too old to use that, and you should always select an Infiniband-aware implementation if you plan to run on multiple nodes. 2b5ac1f6ca7758431c361d352cc2adc420da363f 672 652 2021-02-14T09:36:50Z Mjh 2 wikitext text/x-wiki == What is MPI? == MPI stands for ''Message Passing Interface''. It provides a standard for communication between processes either running on the same host or over a network. Programs written using this standard can communicate with each other without needing to know details of the way in which they are connected. 
For a more detailed description see the [http://en.wikipedia.org/wiki/Message_Passing_Interface wikipedia page]. The MPI standard (with extensions in some cases) has been implemented by many different groups. MPI tutorials are widely available on the web. == MPI on the cluster == There are several implementations of MPI on the cluster (see full list below). All of them provide compilation tools (mpicc, mpif77), libraries, and a means of running code. All MPI libraries on the cluster know about, and will use, your current allocation of nodes when you have run a [[jobs|job]]. You should read the page on [[jobs]] before trying to do anything with any of the example scripts given here! All MPI implementations are provided using the [[modules|module]] system and so largely the way to use them is interchangeable between different implementations. At the time of writing (Sep 2020) we recommend using OpenMPI-4.0.5 as it appears to provide the fastest use of the Infiniband network that link the nodes. Some MPI implementations do not use Infiniband at all and these should be avoided if possible. == Compiling == Make sure the correct module is loaded: <pre> module load openmpi-4.0.5 </pre> Now compile your MPI code as normal (e.g. <tt>mpicc</tt>, <tt>mpif77</tt>). == Running in a job == Your job control system script should call the correct version <tt>mpiexec</tt>: <pre> #!/bin/sh -f #PBS -N mpi-demo #PBS -m abe #PBS -l nodes=2:ppn=32 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ eval `/usr/bin/modulecmd bash load openmpi-4.0.5` mpiexec /home/myusername/mympijob echo ------------------------------------------------------ echo Job ends </pre> <tt>mpiexec</tt> uses the Torque environment variables to start processes up on the appropriate nodes. So you will see that you don't have to specify anything else other than the executable to run on the mpiexec line -- mpiexec will do the right thing. Note that in this example the '''only''' key lines are (1) the selection of the right MPI library and (2) the one calling <tt>mpiexec</tt>. The others are Torque directives or are there for debugging/demonstration purposes. By default, <tt>mpiexec</tt> will run as many processes as there are processors available to it: so if you have nodes=2:ppn=32 set, as in the example above, there will be 2*32 MPI processes (32 per node). == Warning == You may see warning messages as follows: <pre> -------------------------------------------------------------------------- WARNING: There was an error initializing an OpenFabrics device. Local host: node137 Local device: mlx4_0 -------------------------------------------------------------------------- ... 
[node137:15044] 31 more processes have sent help message help-mpi-btl-openib.txt / error in device init [node137:15044] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages </pre> These messages arise because in this version of OpenMPI there are two ways of trying to initialize the Infiniband cards, and they conflict. The messages are harmless but can be suppressed by using: <tt>mpiexec --mca btl '^openib' ...</tt> == List of MPI implementations == The available implementations are listed in the table below. {| !Module name !Name !MPI version !Infiniband? |- |openmpi-4.0.5||OpenMPI 4.0.5||3||Y |- |mvapich2-2.3||MVAPICH2 2.3||2||Y |- |intel-mpi||Intel MPI Library||3||Y |- |mpich2-local||MPICH2||2||N |- |mpi/mpich-3.0-x86_64||MPICH 3.0||2||N |- |mpi/mpich-3.2-x86_64||MPICH 3.2||2||N |- |} In general you should select an implementation which implements the latest version of MPI, unless your code is too old to use that, and you should always select an Infiniband-aware implementation if you plan to run on multiple nodes. 026df85578754503ee62340aca0748e1aa8da9b0 GPUs 0 71 655 610 2020-09-30T18:15:43Z Mjh 2 wikitext text/x-wiki Several machines on the cluster have attached NVIDIA GPUs. * gpu1: The attached GPUs are 6 Tesla K80 units with 16 GB VRAM. * gpu2 and gpu3: These both have 3 Tesla V100 units with 16 GB VRAM each on gpu3 and 32 GB VRAM on gpu2. * ramius has a single Tesla K40c. ramius is a private machine, the other machines are accessible through the gpu queue. The NVIDIA CUDA software is installed in '/soft/cuda-VERSION' where VERSION is the version number of CUDA. You can do e.g. <tt>module load cuda-10.0</tt> to make sure the libraries and compilers are on your path. Note: * At the moment, there is no provision for preventing contention between users of the Tesla units. That is, if more than one user runs a job that tries to use them at one time, there may be contention and we don't know what effect this will have. * Equally, there is no provision for preventing contention on the host side -- if you only ask for a few CPU cores and another CPU-bound job takes the remainder, this may (or may not) interfere with your job. The easiest way to overcome any issues of contention would be to ask for all cores of the host machine, even if you're not going to use them all: then the job submission system will not allow any other jobs to run on the host, so you have exclusive access to the GPU. The current arrangement in principle allows non-GPU jobs running on a machine with a GPU to block CUDA jobs. If this becomes a problem, we will review the queueing arrangements. == Tensorflow == Tensorflow packages are available under python3.6. Make sure CUDA 10.0 is loaded (see above) and <tt>/soft/python3/usr/local/lib64/python3.6/site-packages</tt> is on your PYTHONPATH. == Via OpenGL context == It is possible for an application to access an OpenGL context on the GPUs using the provided 'headless' X server configuration. * User needs to start the X server: <pre> X :42 & </pre> where "42" is any free display number. If a number is chosen that is not free, an error message will appear. * Set the DISPLAY environment variable: <pre> export DISPLAY=:42.0 </pre> where "42" is the free display number used when starting the X server (the syntax shown assumes the bash shell) and .0 denotes the first screen (there are currently four - but using more than one is untested).
* start application, which will need to request an OpenGL context, make use of it, output results unattended and quit after it is done (a render job for example). Submission of such jobs through the job queue is also untested, so needs to be discussed with the administrators first. 02ad84907983f9fb007c31d582c388120778496d 663 655 2021-01-14T18:28:33Z Mjh 2 /* Tensorflow */ wikitext text/x-wiki Several machines on the cluster have attached NVIDIA GPUs. * gpu1: The attached GPUs are 6 Tesla K80 units with 16GB VRAM. * gpu2 and gpu3: These both have 3 Tesla V100 units with 16 GB VRAM each on gpu3 and 32 GB VRAM on gpu2. * ramius has a single Tesla K40c. ramius is a private machine, the other machines are accessible through the gpu queue. The NVIDIA CUDA software is installed in '/soft/cuda-VERSION' where VERSION is the version number of CUDA. You can do e.g. <tt>module load cuda-10.0</tt> to make sure the libraries and compilers are on your path. Note: * At the moment, there is no provision for preventing contention between users of the Tesla unit. That is, if more than one user runs a job that tries to use it at one time, there may be contention and we don't know what effect this will have. * Equally, there is no provision for preventing contention on the host side -- if you only ask for a few CPU cores and another CPU-bound job takes the remainder, this may (or may not) interfere with your job. The easiest way to overcome any issues of contention would be to ask for all cores of the host machine, even if you're not going to use them all: then the job submission system will not allow any other jobs to run on the host, so you have exclusive access to the GPU. The current arrangement in principle allows non-GPU jobs running on a machine with a GPU to block CUDA jobs. If this becomes a problem, we will review the queueing arrangements. == Tensorflow == Tensorflow packages are available under python3.6. Make sure CUDA 10.0 is loaded (see above) and <tt>/soft/python3/usr/local/lib64/python3.6/site-packages</tt> is on your PYTHONPATH e.g. by doing <tt>module load python3 cuda-10.0</tt>. If you are running on a GPU machine you will then get GPU acceleration in Tensorflow. == Via OpenGL context == It is possible for an application to access an OpenGL context on the GPUs using the provided 'headless' X server configuration. * User needs to start X server: <pre> X :42 & </pre> where "42" is a free display number, assuming "42" is free. If a number is chosen that is not free, an error message will appear. * Set the DISPLAY environment variable: <pre> export DISPLAY=:42.0 </pre> where "42" is the free display number as used for starting the X server, assuming the syntax of the BASH shell and .0 denotes the first screen (there are currently four - but using more than one is untested). * start application, which will need to request an OpenGL context, make use of it, output results unattended and quit after it is done (a render job for example). Submission of such jobs through the job queue is also untested, so needs to be discussed with the administrators first. eb3c5cb10d040c1a8c32e1dabe206f17f48838e5 Accounts 0 3 659 551 2020-10-13T16:10:38Z Mayaahorton 17 wikitext text/x-wiki Accounts are available to all staff and research students of UH, and to others by special arrangement. Access is granted subject to the [[Terms of use]] of the cluster and to observance of our usage [[policies]]. Accounts may be suspended or cancelled if the cluster is abused: see also our [[Account cancellation policy]]. 
To get an account, contact the [[administrators]]. When doing so, please provide a valid email address and indicate that you have read the [[terms of use]] and [[policies]]. Your login details will be emailed to you. If you do not specify a preferred username (typically a combination of your first and last name, initials, or the first part of your email address), one will be assigned to you. Your username is visible to others. If you are at UH, please let us know which department you belong to. External users should specify which group they are working with, such as WEAVE or LOFAR. eb6b31396778c8e022e8266906406a6bd5452f83 674 659 2021-04-22T17:33:21Z Mjh 2 wikitext text/x-wiki Accounts are available to all staff and research students of UH, and to others by special arrangement. Access is granted subject to the [[Terms of use]] of the cluster and to observance of our usage [[policies]]. Accounts may be suspended or cancelled if the cluster is abused: see also our [[Account cancellation policy]]. To get an account, contact the [[administrators]]. When doing so, please provide a valid email address and indicate that you have read the [[terms of use]] and [[policies]]. Your login details will be emailed to you. If you do not specify a preferred username (typically your UH username if you have one; optionally, a combination of your first and last name, initials, or the first part of your email address), one will be assigned to you. Your username is visible to others. If you are at UH, please let us know which department you belong to. External users should specify which group they are working with, such as WEAVE or LOFAR. 956d2823e174adcac1a3d7594ef344739ce7e356 Administrators 0 6 660 601 2020-10-13T16:15:48Z Mayaahorton 17 /* Administrators */ wikitext text/x-wiki == Administrators == UH users wanting help or support with UHHPC should ask for it through the UH helpdesk (e-mail helpdesk@herts.ac.uk or log in to [https://helpdesk.herts.ac.uk/ the web interface]). Please make sure that your e-mail or help request includes 'UHHPC' in the subject line to make sure that it is routed through to the correct team. Failure to do so can result in very long delays. For account requests, please provide your contact details and department or working group as specified on the [[accounts]] page. External users, such as external consortium users, should send e-mail to Martin Hardcastle (m.j.hardcastle@herts.ac.uk). 692a1607727e9c8ce17246063adf40f1daf75ee9 Storage 0 8 661 611 2020-11-13T15:49:43Z Mjh 2 wikitext text/x-wiki The '''storage''' available on the cluster comes in three flavours. Please use all cluster storage space responsibly: when you no longer need it, delete the contents or transfer them elsewhere so that others can use the space. Large datasets will be being processed on the cluster. (See also [[policies]].) == Overview == Most cluster users will use /home for small amounts of important data (no more than 50 GB) and /beegfs/general for large amounts of data. Your home directory is created automatically for you, but you will need to make a directory /beegfs/general/your_username to work on the /beegfs volume. For details of all the different areas available, read more below. == System-wide NFS storage == This is shared across all machines and served by single servers in each case. It can be slow if many users are trying to access it simultaneously. We are phasing out the use of this kind of storage for large volumes. 
Currently general-user NFS volumes are: * 3 Tb of user home directories, mounted as /home * Software directory /soft * 58 Tb of scratch for CAIR users only, mounted as /cair-scratch * 334 Tb of scratch for CAR users only, mounted as /car-data * 77 Tb of scratch for CAIR users only, mounted as /cair-work In addition there are some special NFS volumes, which are mounted under /data and are only available to particular users or groups: * /data/lofar : for LOFAR-UK users * /data/jim : Jim Geach * /data/ralf : Ralf Napiwotzki * /data/astroml : Machine learning group The /home disc is for files that are important and require long-term storage (source code, qsub scripts, etc). It should not be used for general data or for large quantities of output from jobs, for example. Large files should be stored on the relevant data disk (if you want to work relative to /home, use a symbolic link). A [[quota]] system is in place on /home. == Distributed file system == The cluster uses the [https://www.beegfs.io/content/ Beegfs] file system for distributed storage, mounted at /beegfs . Files stored here are distributed over a number of servers in a way that's transparent to the user, who sees one big file system. Because there are multiple servers, /beegfs scales much better under load than the NFS storage. There is 1.1 PB of beegfs storage nominally distributed as follows: * 485 TB: general use, under /beegfs/general * 272 TB: CAR, under /beegfs/car * 480 TB: LOFAR-UK, under /beegfs/lofar * 90 TB: CAIR, under /beegfs/cair Use of the relevant subdirectories means that you think you have permission to use those allocations. In practice there is no difference in where or how the data are stored. == Disks local to machines == The four [[SMP machines]] have their own local scratch discs: see the relevant page for more information. These are mounted system-wide as /smp1 .. /smp4 Nodes also have a small amount of [[local disk space]] accessible to jobs (but not mounted on the head node), and [[ramdisks]] are also available. == Backups == No data area on the cluster is currently backed up. You must take responsibility for your own backups. /home is backed up by nightly rsync to another location on the cluster: that means if you delete a crucial file we ''may'' have a useful backup (ask straight away). 5ac4464e3d3250c01a35128c25bcd71b3ffc0168 Jobs 0 9 662 625 2020-11-17T11:15:49Z Mjh 2 /* Running code */ wikitext text/x-wiki The main mechanism for carrying out computing on the cluster is the '''batch queue system'''. This is based around [http://docs.adaptivecomputing.com/torque/help.htm Torque] (formerly known as PBS) and [http://docs.adaptivecomputing.com/maui/index.php Maui], which are free software and widely used in the HPC world, particularly in academia. Torque is the queueing system, which handles job submission, and Maui is the scheduler, which decides what jobs are run. Rather than running processes directly on the compute nodes, you submit a script which, in general, contains both directives to the queuing system and the commands needed to run your jobs. The scheduler will then decide, based on the available resources, whether your job can be run immediately or must be deferred till later. If other people's jobs are running, you are likely to have to wait. It is important therefore to request only the minimum resources you need: a job that only needs 8 nodes may be able to run immediately while a job that requests all nodes in the main cluster may need to wait for days. 
Once your job starts running the standard output and standard error can be saved or discarded, at your option. Most jobs will operate on files in one of the cluster-wide file systems (see [[storage]]). You can also arrange to have the system send you e-mail when your job starts and/or ends. You should enable [[passwordless ssh]] between nodes before starting to use the job submission system. Various parts of it and code that depends on it rely on being able to use ssh or scp. == Basic commands == The command to submit a job to the batch queue system is <tt>qsub</tt>. <tt>qsub</tt> takes one obligatory argument, the name of the script that will be run to start the job: <pre> qsub myjob.sh </pre> This must be a script: specifying a binary file will not work. Optionally, qsub can take a number of arguments that specify what resources the job requires and how it is to be handled by the queuing system. Alternatively, these can be specified in the script, tagged with lines beginning <tt>#PBS</tt> before any executable lines. For example, <pre> cat << END > myjob1.sh #!/bin/sh echo "Hello world" END qsub -N hello -m abe myjob1.sh </pre> and <pre> cat <<END > myjob2.sh #!/bin/sh #PBS -N hello #PBS -m abe echo "Hello world" END qsub myjob2.sh </pre> are functionally equivalent. It is often more useful to have the options specified in the script. You can also have a mixture: if the same option is specified on the <tt>qsub</tt> command line and in a script directive, the command line takes precedence. A full list of qsub options is given by <tt>man qsub</tt> or is available [http://docs.adaptivecomputing.com/torque/2-5-12/help.htm#topics/commands/qsub.htm online]. Some important ones are as follows: * -N: specify the name of the job as it will appear in the queue * -m: say when the system should e-mail you: <tt>-m abe</tt> means to e-mail on abort, begin and end. Before setting this flag, read the page on [[Mail]] and take appropriate action! * -k: specify whether the standard output or standard error streams of the job should be mirrored in your home directory. If you specify this, you will be able to see the output as it's generated, but it will appear in your home directory. If you don't, the output will be stored in the directory specified by -o and -e or in the current working directory of qsub if they are not specified, but it will only appear once the job is finished. * -j: specify whether the output and error streams should be kept separate or merged. * -o, -e: specify where standard output/error should be stored, if not in the current directory * -q: specify what [[queues|queue]] the job will be run on. * -t: start multiple jobs simultaneously (see below) * -W: for job inter-dependencies (see below) * -v: to set environment variables (see below) The most important option is <tt>-l</tt>, which specifies the resources to be used by the job. On our system the useful resources to specify are * <tt>walltime</tt>: the expected execution time in h:m:s format (if unspecified, the default walltime on all queues is 24 hours). * <tt>nodes</tt>: a list of requirements on the nodes to be allocated, or possibly more than one joined by a plus sign. * <tt>pmem</tt>: the physical [[memory]] requirements for your job. * <tt>file</tt>: the [[local disk space]] to be used by your job. 
For example, <pre> qsub -l walltime=2:0:0 -l nodes=64 job.sh ## run for two hours on 64 CPUs (which may be spread over up to 64 physical nodes) qsub -l walltime=1:0:0 -l nodes=2:ppn=32 job.sh ## run for one hour on 2 nodes with 32 CPUs per node qsub -l walltime=0:1:0 -l nodes=1:2 -l pmem=8gb job5.sh ## Run two processes on one node for one minute, requesting 8 Gb of memory each </pre> (In general you will not want to specify node or [[architecture|chassis]] names directly, as this will interfere with the efficiency of the scheduling. Do not attempt to use these options unless you know what you are doing. Note also that in the present configuration requests for fewer processors per node than are available may be consolidated onto fewer nodes. This should never be a problem unless you are specifically testing inter-node communication, in which case you could specify node names to force running on different nodes. For efficiency, you are recommended always to ask for and use the maximum number of processes per node for MPI or multi-threaded code -- that is, 32 processes for the main cluster. If you are running single-threaded code, you should ask for one processor per node. The defaults for all the other options are sensible but you ''must'' think carefully about the resources you need. Requesting more time or resources than you need will lead to inefficiency in scheduling your jobs: requesting less than you need (particularly in terms of walltime) is likely to mean that your job does not run to completion. A job that exceeds the walltime estimate will be terminated. ''A walltime should be specified: the default walltime (see [[Queues]]) is probably not what you want.'' Once a job is submitted, you can view its progress with <tt>qstat</tt>: <pre> >qstat Job id Name User Time Use S Queue ------------------- ---------------- --------------- -------- - ----- 1765.stri-cluster HCG62-doublebeta mjh 00:00:00 R all 1766.stri-cluster ...59-doublebeta mjh 00:00:00 R all 1767.stri-cluster ...07-doublebeta mjh 00:00:00 R all 1768.stri-cluster ...83-doublebeta mjh 00:00:00 R all 1769.stri-cluster ...73-doublebeta mjh 00:00:00 R all 1770.stri-cluster ...61-doublebeta mjh 0 Q all 1771.stri-cluster ...36-doublebeta mjh 0 Q all 1772.stri-cluster ...44-doublebeta mjh 0 Q all 1773.stri-cluster ...29-doublebeta mjh 0 Q all 1774.stri-cluster ...33-doublebeta mjh 0 Q all 1775.stri-cluster ...46-doublebeta mjh 0 Q all >qstat -a stri-cluster.herts.ac.uk: Req'd Req'd Elap Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time -------------------- -------- -------- ---------- ------ ----- --- ------ ----- - ----- 1765.stri-cluster.he mjh all HCG62-doub 11125 16 -- -- 05:00 R 01:56 1766.stri-cluster.he mjh all IC1459-dou 10658 16 -- -- 05:00 R 01:56 1767.stri-cluster.he mjh all NGC3607-do 7355 16 -- -- 05:00 R 01:56 1768.stri-cluster.he mjh all NGC383-dou 12069 16 -- -- 05:00 R 01:56 1769.stri-cluster.he mjh all NGC4073-do 12793 16 -- -- 05:00 R 01:56 1770.stri-cluster.he mjh all NGC4261-do -- 16 -- -- 05:00 Q -- 1771.stri-cluster.he mjh all NGC4636-do -- 16 -- -- 05:00 Q -- 1772.stri-cluster.he mjh all NGC5044-do -- 16 -- -- 05:00 Q -- 1773.stri-cluster.he mjh all NGC5129-do -- 16 -- -- 05:00 Q -- 1774.stri-cluster.he mjh all NGC533-dou -- 16 -- -- 05:00 Q -- 1775.stri-cluster.he mjh all NGC5846-do -- 16 -- -- 05:00 Q -- </pre> The different options to qstat give different views of the queue. Here we see that some jobs are running (R) and some are queued (Q). 
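Two other views that are often convenient (both are standard Torque options; the username and job ID here are illustrative) are restricting the listing to your own jobs and printing the full record, including the job state, for a single job:

<pre>
qstat -u myusername   # only jobs belonging to one user
qstat -f 1765         # full details of a single job
</pre>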
You may also see jobs that have recently completed (C), jobs that are held pending some condition (H) and, hopefully rarely, jobs where there is some error in the scheduling system (E). You can delete jobs from the queue with the Torque command <tt>qdel</tt>, you can change your resource request with <tt>qalter</tt>, and you can move queued jobs between different queues with <tt>qmove</tt>. Look at the <tt>man</tt> pages for these commands for more information. == Running code == Your job will start running on ''one'' of the nodes it has been allocated. It is your script's responsibility to start your code running on any other nodes you have been allocated. At run time, environment variables contain information about the job. A full list is [http://docs.adaptivecomputing.com/torque/2-5-12/help.htm#topics/2-jobs/exportedBatchEnvVar.htm here]. The crucial one is $PBS_NODEFILE, which provides a list of nodes on which the job should run. ''You must honour this node list.'' The [http://docs.adaptivecomputing.com/torque/2-5-12/help.htm#topics/commands/pbsdsh.htm pbsdsh] command can be used in your script to run a single program on all the allocated nodes, as in the following example: <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ pbsdsh hostname echo ------------------------------------------------------ echo Job ends </pre> pbsdsh would be suitable for starting a number of parallel processes which do not need to communicate, e.g. multiple identical Monte Carlo simulations. For scripts to start [[MPI]] processes in such a way that inter-process communication (IPC) works see the description of [[MPI]]. If you are running [[OpenMP]] code, which should only be running on a single node, then you need to make sure that the code does not grab all available CPUs. An easy way to make sure that you take only the CPUs that you have been allocated would be to put <pre> export OMP_NUM_THREADS=`cat $PBS_NODEFILE | wc -l` </pre> in the qsub script before the code runs. == Environment == In all cases your script must set the environment that you want your program to run in. Environment variables, aliases and working directory are not inherited from the shell that runs qsub. So you will often want to change to an appropriate working directory and to source some startup files in your qsub script before you actually execute any code. 
<pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ cd /home/fred/my_working_directory export PATH=/home/fred/my_binaries:${PATH} /usr/local/bin/mpiexec my-mpi-code arg1 arg2 echo ------------------------------------------------------ echo Job ends </pre> Note that you can set environment variables in the environment of the running script with the <tt>-v</tt> option to <tt>qsub</tt>: <pre> qsub -v NAME=fred myjob.sh </pre> The value of <tt>$NAME</tt> is then available to the script. This is useful if you want to write a generic script and control it using a parameter, rather than editing the script. == Multiple job submission == The <tt>-t</tt> option to qsub allows you to create multiple jobs simultaneously. These will all run the same script but will differ in the value of the environment variable <tt>$PBS_ARRAYID</tt>. Your script should normally use this variable to decide how to behave differently. For example, multiple runs of a Monte Carlo simulation might use it to store the output in different files. The jobs will all be submitted to the queue immediately and may run concurrently, depending on the resources they request, so it is important to make sure that it is safe to do this. A common mistake is to use the -t option wrongly. <pre> qsub -t 4 myjob.qsub </pre> will only start ''one'' job, with $PBS_ARRAYID set to 4. <pre> qsub -t 1-4 myjob.qsub </pre> will queue four jobs, with $PBS_ARRAYID ranging from 1 to 4. For a large array you have the option to limit the number of jobs that will run concurrently -- perhaps because they all want access to IO resources and will compete with each other and run out of walltime if they all run at once. So <pre> qsub -t 1-1000%20 myjob.qsub </pre> will run 1000 versions of the job but ensure that only 20 are running at a given time. == Interactive jobs == If you need to access nodes interactively, see the separate page on [[interactive jobs]]. == Jobs that depend on other jobs == You may wish to submit jobs that depend on each other: for example, to submit a job that only runs after another job has finished. (This allows you to stack up very long computing requests without violating the maximum walltime constraints.) Look at <tt>man qsub</tt> for the full details, but for a simple dependence use <pre> qsub -W depend=afterany:123456 myjob.qsub </pre> which will cause your job to be run only after job 123456 has finished (no matter what its output status). a0fffdc451ec90ea23284adc5f16c2394cc93e42 675 662 2021-04-28T15:20:14Z Mjh 2 wikitext text/x-wiki The main mechanism for carrying out computing on the cluster is the '''batch queue system'''. 
This is based around [http://docs.adaptivecomputing.com/torque/help.htm Torque] (formerly known as PBS) and [http://docs.adaptivecomputing.com/maui/index.php Maui], which are free software and widely used in the HPC world, particularly in academia. Torque is the queueing system, which handles job submission, and Maui is the scheduler, which decides what jobs are run. Rather than running processes directly on the compute nodes, you submit a script which, in general, contains both directives to the queuing system and the commands needed to run your jobs. The scheduler will then decide, based on the available resources, whether your job can be run immediately or must be deferred till later. If other people's jobs are running, you are likely to have to wait. It is important therefore to request only the minimum resources you need: a job that only needs 8 nodes may be able to run immediately while a job that requests all nodes in the main cluster may need to wait for days. Once your job starts running the standard output and standard error can be saved or discarded, at your option. Most jobs will operate on files in one of the cluster-wide file systems (see [[storage]]). You can also arrange to have the system send you e-mail when your job starts and/or ends. You should enable [[passwordless ssh]] between nodes before starting to use the job submission system. Various parts of it and code that depends on it rely on being able to use ssh or scp. == Submission of jobs == The command to submit a job to the batch queue system is <tt>qsub</tt>. <tt>qsub</tt> takes one obligatory argument, the name of the script that will be run to start the job: <pre> qsub myjob.sh </pre> This must be a script: specifying a binary file will not work. Optionally, qsub can take a number of arguments that specify what resources the job requires and how it is to be handled by the queuing system. Alternatively, these can be specified in the script, tagged with lines beginning <tt>#PBS</tt> before any executable lines. For example, <pre> cat << END > myjob1.sh #!/bin/sh echo "Hello world" END qsub -N hello -m abe myjob1.sh </pre> and <pre> cat <<END > myjob2.sh #!/bin/sh #PBS -N hello #PBS -m abe echo "Hello world" END qsub myjob2.sh </pre> are functionally equivalent. It is often more useful to have the options specified in the script. You can also have a mixture: if the same option is specified on the <tt>qsub</tt> command line and in a script directive, the command line takes precedence. A full list of qsub options is given by <tt>man qsub</tt> or is available [http://docs.adaptivecomputing.com/torque/2-5-12/help.htm#topics/commands/qsub.htm online]. Some important ones are as follows: * -N: specify the name of the job as it will appear in the queue * -m: say when the system should e-mail you: <tt>-m abe</tt> means to e-mail on abort, begin and end. Before setting this flag, read the page on [[Mail]] and take appropriate action! * -k: specify whether the standard output or standard error streams of the job should be mirrored in your home directory. If you specify this, you will be able to see the output as it's generated, but it will appear in your home directory. If you don't, the output will be stored in the directory specified by -o and -e or in the current working directory of qsub if they are not specified, but it will only appear once the job is finished. * -j: specify whether the output and error streams should be kept separate or merged. 
* -o, -e: specify where standard output/error should be stored, if not in the current directory * -q: specify what [[queues|queue]] the job will be run on. * -t: start multiple jobs simultaneously (see below) * -W: for job inter-dependencies (see below) * -v: to set environment variables (see below) The most important option is <tt>-l</tt>, which specifies the resources to be used by the job. On our system the useful resources to specify are * <tt>walltime</tt>: the expected execution time in h:m:s format (if unspecified, the default walltime on all queues is 24 hours). * <tt>nodes</tt>: a list of requirements on the nodes to be allocated, or possibly more than one joined by a plus sign. * <tt>pmem</tt>: the physical [[memory]] requirements for your job. * <tt>file</tt>: the [[local disk space]] to be used by your job. For example, <pre> qsub -l walltime=2:0:0 -l nodes=64 job.sh ## run for two hours on 64 CPUs (which may be spread over up to 64 physical nodes) qsub -l walltime=1:0:0 -l nodes=2:ppn=32 job.sh ## run for one hour on 2 nodes with 32 CPUs per node qsub -l walltime=0:1:0 -l nodes=1:2 -l pmem=8gb job5.sh ## Run two processes on one node for one minute, requesting 8 Gb of memory each </pre> (In general you will not want to specify node or [[architecture|chassis]] names directly, as this will interfere with the efficiency of the scheduling. Do not attempt to use these options unless you know what you are doing. Note also that in the present configuration requests for fewer processors per node than are available may be consolidated onto fewer nodes. This should never be a problem unless you are specifically testing inter-node communication, in which case you could specify node names to force running on different nodes. For efficiency, you are recommended always to ask for and use the maximum number of processes per node for MPI or multi-threaded code -- that is, 32 processes for the main cluster. If you are running single-threaded code, you should ask for one processor per node. The defaults for all the other options are sensible but you ''must'' think carefully about the resources you need. Requesting more time or resources than you need will lead to inefficiency in scheduling your jobs: requesting less than you need (particularly in terms of walltime) is likely to mean that your job does not run to completion. A job that exceeds the walltime estimate will be terminated. 
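A reasonable starting point is to state the walltime (and any memory requirement) explicitly in the script header rather than on the command line; a minimal sketch, with all names and values purely illustrative, is:

<pre>
#!/bin/sh
#PBS -N myjob
#PBS -l nodes=1:ppn=32
#PBS -l walltime=12:00:00
#PBS -l pmem=2gb
cd /beegfs/general/myusername
./myprogram
</pre>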
''A walltime should be specified: the default walltime (see [[Queues]]) is probably not what you want.'' == Viewing jobs in the queue == Once a job is submitted, you can view its progress with <tt>qstat</tt>: <pre> >qstat Job id Name User Time Use S Queue ------------------- ---------------- --------------- -------- - ----- 1765.stri-cluster HCG62-doublebeta mjh 00:00:00 R all 1766.stri-cluster ...59-doublebeta mjh 00:00:00 R all 1767.stri-cluster ...07-doublebeta mjh 00:00:00 R all 1768.stri-cluster ...83-doublebeta mjh 00:00:00 R all 1769.stri-cluster ...73-doublebeta mjh 00:00:00 R all 1770.stri-cluster ...61-doublebeta mjh 0 Q all 1771.stri-cluster ...36-doublebeta mjh 0 Q all 1772.stri-cluster ...44-doublebeta mjh 0 Q all 1773.stri-cluster ...29-doublebeta mjh 0 Q all 1774.stri-cluster ...33-doublebeta mjh 0 Q all 1775.stri-cluster ...46-doublebeta mjh 0 Q all >qstat -a stri-cluster.herts.ac.uk: Req'd Req'd Elap Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time -------------------- -------- -------- ---------- ------ ----- --- ------ ----- - ----- 1765.stri-cluster.he mjh all HCG62-doub 11125 16 -- -- 05:00 R 01:56 1766.stri-cluster.he mjh all IC1459-dou 10658 16 -- -- 05:00 R 01:56 1767.stri-cluster.he mjh all NGC3607-do 7355 16 -- -- 05:00 R 01:56 1768.stri-cluster.he mjh all NGC383-dou 12069 16 -- -- 05:00 R 01:56 1769.stri-cluster.he mjh all NGC4073-do 12793 16 -- -- 05:00 R 01:56 1770.stri-cluster.he mjh all NGC4261-do -- 16 -- -- 05:00 Q -- 1771.stri-cluster.he mjh all NGC4636-do -- 16 -- -- 05:00 Q -- 1772.stri-cluster.he mjh all NGC5044-do -- 16 -- -- 05:00 Q -- 1773.stri-cluster.he mjh all NGC5129-do -- 16 -- -- 05:00 Q -- 1774.stri-cluster.he mjh all NGC533-dou -- 16 -- -- 05:00 Q -- 1775.stri-cluster.he mjh all NGC5846-do -- 16 -- -- 05:00 Q -- </pre> The different options to qstat give different views of the queue. Here we see that some jobs are running (R) and some are queued (Q). You may also see jobs that have recently completed (C), jobs that are held pending some condition (H) and, hopefully rarely, jobs where there is some error in the scheduling system (E). == Changing and deleting jobs == You can delete jobs from the queue with the Torque command <tt>qdel</tt>, you can change your resource request with <tt>qalter</tt>, and you can move queued jobs between different queues with <tt>qmove</tt>. Look at the <tt>man</tt> pages for these commands for more information. == Running code == Your job will start running on ''one'' of the nodes it has been allocated. It is your script's responsibility to start your code running on any other nodes you have been allocated. At run time, environment variables contain information about the job. A full list is [http://docs.adaptivecomputing.com/torque/2-5-12/help.htm#topics/2-jobs/exportedBatchEnvVar.htm here]. The crucial one is $PBS_NODEFILE, which provides a list of nodes on which the job should run. 
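For example, a script can work out exactly what it has been given by reading this file, which contains one line per allocated core (so a node name appears once for each core you have on that node):

<pre>
# inside a job script
NCORES=$(wc -l < $PBS_NODEFILE)          # total cores allocated
NNODES=$(sort -u $PBS_NODEFILE | wc -l)  # distinct nodes
echo "Running on $NCORES cores across $NNODES nodes"
</pre>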
''You must honour this node list.'' The [http://docs.adaptivecomputing.com/torque/2-5-12/help.htm#topics/commands/pbsdsh.htm pbsdsh] command can be used in your script to run a single program on all the allocated nodes, as in the following example: <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ pbsdsh hostname echo ------------------------------------------------------ echo Job ends </pre> pbsdsh would be suitable for starting a number of parallel processes which do not need to communicate, e.g. multiple identical Monte Carlo simulations. For scripts to start [[MPI]] processes in such a way that inter-process communication (IPC) works see the description of [[MPI]]. If you are running [[OpenMP]] code, which should only be running on a single node, then you need to make sure that the code does not grab all available CPUs. An easy way to make sure that you take only the CPUs that you have been allocated would be to put <pre> export OMP_NUM_THREADS=`cat $PBS_NODEFILE | wc -l` </pre> in the qsub script before the code runs. == Environment == In all cases your script must set the environment that you want your program to run in. Environment variables, aliases and working directory are not inherited from the shell that runs qsub. So you will often want to change to an appropriate working directory and to source some startup files in your qsub script before you actually execute any code. <pre> #!/bin/sh -f #PBS -N pbsdsh #PBS -m abe #PBS -l nodes=16:ppn=8 #PBS -l walltime=00:01:00 #PBS -k oe echo ------------------------------------------------------ echo -n 'Job is running on node '; cat $PBS_NODEFILE echo ------------------------------------------------------ echo PBS: qsub is running on $PBS_O_HOST echo PBS: originating queue is $PBS_O_QUEUE echo PBS: executing queue is $PBS_QUEUE echo PBS: working directory is $PBS_O_WORKDIR echo PBS: execution mode is $PBS_ENVIRONMENT echo PBS: job identifier is $PBS_JOBID echo PBS: job name is $PBS_JOBNAME echo PBS: node file is $PBS_NODEFILE echo PBS: current home directory is $PBS_O_HOME echo PBS: PATH = $PBS_O_PATH echo ------------------------------------------------------ cd /home/fred/my_working_directory export PATH=/home/fred/my_binaries:${PATH} /usr/local/bin/mpiexec my-mpi-code arg1 arg2 echo ------------------------------------------------------ echo Job ends </pre> Note that you can set environment variables in the environment of the running script with the <tt>-v</tt> option to <tt>qsub</tt>: <pre> qsub -v NAME=fred myjob.sh </pre> The value of <tt>$NAME</tt> is then available to the script. This is useful if you want to write a generic script and control it using a parameter, rather than editing the script. == Multiple job submission == The <tt>-t</tt> option to qsub allows you to create multiple jobs simultaneously. 
These will all run the same script but will differ in the value of the environment variable <tt>$PBS_ARRAYID</tt>. Your script should normally use this variable to decide how to behave differently. For example, multiple runs of a Monte Carlo simulation might use it to store the output in different files. The jobs will all be submitted to the queue immediately and may run concurrently, depending on the resources they request, so it is important to make sure that it is safe to do this. A common mistake is to use the -t option wrongly. <pre> qsub -t 4 myjob.qsub </pre> will only start ''one'' job, with $PBS_ARRAYID set to 4. <pre> qsub -t 1-4 myjob.qsub </pre> will queue four jobs, with $PBS_ARRAYID ranging from 1 to 4. For a large array you have the option to limit the number of jobs that will run concurrently -- perhaps because they all want access to IO resources and will compete with each other and run out of walltime if they all run at once. So <pre> qsub -t 1-1000%20 myjob.qsub </pre> will run 1000 versions of the job but ensure that only 20 are running at a given time. == Interactive jobs == If you need to access nodes interactively, see the separate page on [[interactive jobs]]. == Jobs that depend on other jobs == You may wish to submit jobs that depend on each other: for example, to submit a job that only runs after another job has finished. (This allows you to stack up very long computing requests without violating the maximum walltime constraints.) Look at <tt>man qsub</tt> for the full details, but for a simple dependence use <pre> qsub -W depend=afterany:123456 myjob.qsub </pre> which will cause your job to be run only after job 123456 has finished (no matter what its output status). 28d044a865991d8cebe5bff176e8de3101fa9846 Storage 0 8 676 661 2021-05-01T08:50:24Z Mjh 2 /* Distributed file system */ wikitext text/x-wiki The '''storage''' available on the cluster comes in three flavours. Please use all cluster storage space responsibly: when you no longer need it, delete the contents or transfer them elsewhere so that others can use the space. Large datasets will be being processed on the cluster. (See also [[policies]].) == Overview == Most cluster users will use /home for small amounts of important data (no more than 50 GB) and /beegfs/general for large amounts of data. Your home directory is created automatically for you, but you will need to make a directory /beegfs/general/your_username to work on the /beegfs volume. For details of all the different areas available, read more below. == System-wide NFS storage == This is shared across all machines and served by single servers in each case. It can be slow if many users are trying to access it simultaneously. We are phasing out the use of this kind of storage for large volumes. Currently general-user NFS volumes are: * 3 Tb of user home directories, mounted as /home * Software directory /soft * 58 Tb of scratch for CAIR users only, mounted as /cair-scratch * 334 Tb of scratch for CAR users only, mounted as /car-data * 77 Tb of scratch for CAIR users only, mounted as /cair-work In addition there are some special NFS volumes, which are mounted under /data and are only available to particular users or groups: * /data/lofar : for LOFAR-UK users * /data/jim : Jim Geach * /data/ralf : Ralf Napiwotzki * /data/astroml : Machine learning group The /home disc is for files that are important and require long-term storage (source code, qsub scripts, etc). 
It should not be used for general data or for large quantities of output from jobs, for example. Large files should be stored on the relevant data disk (if you want to work relative to /home, use a symbolic link). A [[quota]] system is in place on /home. == Distributed file system == The cluster uses the [https://www.beegfs.io/content/ Beegfs] file system for distributed storage, mounted at /beegfs . Files stored here are distributed over a number of servers in a way that's transparent to the user, who sees one big file system. Because there are multiple servers, /beegfs scales much better under load than the NFS storage as long as it is not full. There is 1.9 PB of beegfs storage nominally distributed as follows: * 497 TB: general use, under /beegfs/general * 553 TB: CAR, under /beegfs/car * 421 TB: LOFAR-UK, under /beegfs/lofar * 298 TB: CACP, under /beegfs/cair Use of the relevant subdirectories means that you think you have permission to use those allocations. In practice there is no difference in where or how the data are stored. == Disks local to machines == The four [[SMP machines]] have their own local scratch discs: see the relevant page for more information. These are mounted system-wide as /smp1 .. /smp4 Nodes also have a small amount of [[local disk space]] accessible to jobs (but not mounted on the head node), and [[ramdisks]] are also available. == Backups == No data area on the cluster is currently backed up. You must take responsibility for your own backups. /home is backed up by nightly rsync to another location on the cluster: that means if you delete a crucial file we ''may'' have a useful backup (ask straight away). c46f4b1207436bc713a009f217c74757b528fa08 686 676 2021-10-01T12:52:50Z Mjh 2 /* System-wide NFS storage */ wikitext text/x-wiki The '''storage''' available on the cluster comes in three flavours. Please use all cluster storage space responsibly: when you no longer need it, delete the contents or transfer them elsewhere so that others can use the space. Large datasets will be being processed on the cluster. (See also [[policies]].) == Overview == Most cluster users will use /home for small amounts of important data (no more than 50 GB) and /beegfs/general for large amounts of data. Your home directory is created automatically for you, but you will need to make a directory /beegfs/general/your_username to work on the /beegfs volume. For details of all the different areas available, read more below. == System-wide NFS storage == This is shared across all machines and served by single servers in each case. It can be slow if many users are trying to access it simultaneously. We are phasing out the use of this kind of storage for large volumes. Currently general-user NFS volumes are: * 3 Tb of user home directories, mounted as /home * Software directory /soft * 58 Tb of scratch for CAIR users only, mounted as /cair-scratch * 77 Tb of scratch for CAIR users only, mounted as /cair-work In addition there are some special NFS volumes, which are mounted under /data and are only available to particular users or groups: * /data/lofar : for LOFAR-UK users * /data/jim : Jim Geach * /data/astroml : Machine learning group The /home disc is for files that are important and require long-term storage (source code, qsub scripts, etc). It should not be used for general data or for large quantities of output from jobs, for example. Large files should be stored on the relevant data disk (if you want to work relative to /home, use a symbolic link). 
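A typical setup (substitute your own username for <tt>myusername</tt>) is to create your work area on the distributed file system and reach it from your home directory through a link:

<pre>
mkdir /beegfs/general/myusername
ln -s /beegfs/general/myusername ~/scratch   # large files live on beegfs but are reachable from /home
</pre>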
A [[quota]] system is in place on /home. == Distributed file system == The cluster uses the [https://www.beegfs.io/content/ Beegfs] file system for distributed storage, mounted at /beegfs . Files stored here are distributed over a number of servers in a way that's transparent to the user, who sees one big file system. Because there are multiple servers, /beegfs scales much better under load than the NFS storage as long as it is not full. There is 1.9 PB of beegfs storage nominally distributed as follows: * 497 TB: general use, under /beegfs/general * 553 TB: CAR, under /beegfs/car * 421 TB: LOFAR-UK, under /beegfs/lofar * 298 TB: CACP, under /beegfs/cair Use of the relevant subdirectories means that you think you have permission to use those allocations. In practice there is no difference in where or how the data are stored. == Disks local to machines == The four [[SMP machines]] have their own local scratch discs: see the relevant page for more information. These are mounted system-wide as /smp1 .. /smp4 Nodes also have a small amount of [[local disk space]] accessible to jobs (but not mounted on the head node), and [[ramdisks]] are also available. == Backups == No data area on the cluster is currently backed up. You must take responsibility for your own backups. /home is backed up by nightly rsync to another location on the cluster: that means if you delete a crucial file we ''may'' have a useful backup (ask straight away). 39133210cab102c9f35274fb7c2aa56867aebfbe 688 686 2021-11-04T08:33:34Z Mjh 2 /* Distributed file system */ wikitext text/x-wiki The '''storage''' available on the cluster comes in three flavours. Please use all cluster storage space responsibly: when you no longer need it, delete the contents or transfer them elsewhere so that others can use the space. Large datasets will be being processed on the cluster. (See also [[policies]].) == Overview == Most cluster users will use /home for small amounts of important data (no more than 50 GB) and /beegfs/general for large amounts of data. Your home directory is created automatically for you, but you will need to make a directory /beegfs/general/your_username to work on the /beegfs volume. For details of all the different areas available, read more below. == System-wide NFS storage == This is shared across all machines and served by single servers in each case. It can be slow if many users are trying to access it simultaneously. We are phasing out the use of this kind of storage for large volumes. Currently general-user NFS volumes are: * 3 Tb of user home directories, mounted as /home * Software directory /soft * 58 Tb of scratch for CAIR users only, mounted as /cair-scratch * 77 Tb of scratch for CAIR users only, mounted as /cair-work In addition there are some special NFS volumes, which are mounted under /data and are only available to particular users or groups: * /data/lofar : for LOFAR-UK users * /data/jim : Jim Geach * /data/astroml : Machine learning group The /home disc is for files that are important and require long-term storage (source code, qsub scripts, etc). It should not be used for general data or for large quantities of output from jobs, for example. Large files should be stored on the relevant data disk (if you want to work relative to /home, use a symbolic link). A [[quota]] system is in place on /home. == Distributed file system == The cluster uses the [https://www.beegfs.io/content/ Beegfs] file system for distributed storage, mounted at /beegfs . 
Files stored here are distributed over a number of servers in a way that's transparent to the user, who sees one big file system. Because there are multiple servers, /beegfs scales much better under load than the NFS storage as long as it is not full. There is 1.9 PB of beegfs storage nominally distributed as follows: * 497 TB: general use, under /beegfs/general * 853 TB: CAR, under /beegfs/car * 421 TB: LOFAR-UK, under /beegfs/lofar * 298 TB: CACP, under /beegfs/cair Use of the relevant subdirectories means that you think you have permission to use those allocations. In practice there is no difference in where or how the data are stored. == Disks local to machines == The four [[SMP machines]] have their own local scratch discs: see the relevant page for more information. These are mounted system-wide as /smp1 .. /smp4 Nodes also have a small amount of [[local disk space]] accessible to jobs (but not mounted on the head node), and [[ramdisks]] are also available. == Backups == No data area on the cluster is currently backed up. You must take responsibility for your own backups. /home is backed up by nightly rsync to another location on the cluster: that means if you delete a crucial file we ''may'' have a useful backup (ask straight away). 710edcfc5724b978bfb49f6366b8af3e954e922f 689 688 2021-11-05T12:31:31Z Mjh 2 /* Distributed file system */ wikitext text/x-wiki The '''storage''' available on the cluster comes in three flavours. Please use all cluster storage space responsibly: when you no longer need it, delete the contents or transfer them elsewhere so that others can use the space. Large datasets will be being processed on the cluster. (See also [[policies]].) == Overview == Most cluster users will use /home for small amounts of important data (no more than 50 GB) and /beegfs/general for large amounts of data. Your home directory is created automatically for you, but you will need to make a directory /beegfs/general/your_username to work on the /beegfs volume. For details of all the different areas available, read more below. == System-wide NFS storage == This is shared across all machines and served by single servers in each case. It can be slow if many users are trying to access it simultaneously. We are phasing out the use of this kind of storage for large volumes. Currently general-user NFS volumes are: * 3 Tb of user home directories, mounted as /home * Software directory /soft * 58 Tb of scratch for CAIR users only, mounted as /cair-scratch * 77 Tb of scratch for CAIR users only, mounted as /cair-work In addition there are some special NFS volumes, which are mounted under /data and are only available to particular users or groups: * /data/lofar : for LOFAR-UK users * /data/jim : Jim Geach * /data/astroml : Machine learning group The /home disc is for files that are important and require long-term storage (source code, qsub scripts, etc). It should not be used for general data or for large quantities of output from jobs, for example. Large files should be stored on the relevant data disk (if you want to work relative to /home, use a symbolic link). A [[quota]] system is in place on /home. == Distributed file system == The cluster uses the [https://www.beegfs.io/content/ Beegfs] file system for distributed storage, mounted at /beegfs . Files stored here are distributed over a number of servers in a way that's transparent to the user, who sees one big file system. 
Because there are multiple servers, /beegfs scales much better under load than the NFS storage as long as it is not full. There is 2.2 PB of beegfs storage nominally distributed as follows: * 486 TB: general use, under /beegfs/general * 831 TB: CAR, under /beegfs/car * 500 TB: LOFAR-UK, under /beegfs/lofar * 380 TB: CACP, under /beegfs/cair Use of the relevant subdirectories means that you think you have permission to use those allocations. In practice there is no difference in where or how the data are stored. == Disks local to machines == The four [[SMP machines]] have their own local scratch discs: see the relevant page for more information. These are mounted system-wide as /smp1 .. /smp4 Nodes also have a small amount of [[local disk space]] accessible to jobs (but not mounted on the head node), and [[ramdisks]] are also available. == Backups == No data area on the cluster is currently backed up. You must take responsibility for your own backups. /home is backed up by nightly rsync to another location on the cluster: that means if you delete a crucial file we ''may'' have a useful backup (ask straight away). a0f0ec9bf91391dc63c7774f2c6807429e422700 695 689 2022-02-14T16:49:56Z Mjh 2 /* Distributed file system */ wikitext text/x-wiki The '''storage''' available on the cluster comes in three flavours. Please use all cluster storage space responsibly: when you no longer need it, delete the contents or transfer them elsewhere so that others can use the space. Large datasets will be being processed on the cluster. (See also [[policies]].) == Overview == Most cluster users will use /home for small amounts of important data (no more than 50 GB) and /beegfs/general for large amounts of data. Your home directory is created automatically for you, but you will need to make a directory /beegfs/general/your_username to work on the /beegfs volume. For details of all the different areas available, read more below. == System-wide NFS storage == This is shared across all machines and served by single servers in each case. It can be slow if many users are trying to access it simultaneously. We are phasing out the use of this kind of storage for large volumes. Currently general-user NFS volumes are: * 3 Tb of user home directories, mounted as /home * Software directory /soft * 58 Tb of scratch for CAIR users only, mounted as /cair-scratch * 77 Tb of scratch for CAIR users only, mounted as /cair-work In addition there are some special NFS volumes, which are mounted under /data and are only available to particular users or groups: * /data/lofar : for LOFAR-UK users * /data/jim : Jim Geach * /data/astroml : Machine learning group The /home disc is for files that are important and require long-term storage (source code, qsub scripts, etc). It should not be used for general data or for large quantities of output from jobs, for example. Large files should be stored on the relevant data disk (if you want to work relative to /home, use a symbolic link). A [[quota]] system is in place on /home. == Distributed file system == The cluster uses the [https://www.beegfs.io/content/ Beegfs] file system for distributed storage, mounted at /beegfs . Files stored here are distributed over a number of servers in a way that's transparent to the user, who sees one big file system. Because there are multiple servers, /beegfs scales much better under load than the NFS storage as long as it is not full. 
There is 2.2 PB of beegfs storage nominally distributed as follows: * 486 TB: general use, under /beegfs/general * 1122 TB: CAR, under /beegfs/car * 500 TB: LOFAR-UK, under /beegfs/lofar * 671 TB: CACP, under /beegfs/cair Use of the relevant subdirectories means that you think you have permission to use those allocations. In practice there is no difference in where or how the data are stored. == Disks local to machines == The four [[SMP machines]] have their own local scratch discs: see the relevant page for more information. These are mounted system-wide as /smp1 .. /smp4 Nodes also have a small amount of [[local disk space]] accessible to jobs (but not mounted on the head node), and [[ramdisks]] are also available. == Backups == No data area on the cluster is currently backed up. You must take responsibility for your own backups. /home is backed up by nightly rsync to another location on the cluster: that means if you delete a crucial file we ''may'' have a useful backup (ask straight away). 1de52a560c803c76d821c79a3bd01cea7add99a1 696 695 2022-02-14T16:50:33Z Mjh 2 /* Distributed file system */ wikitext text/x-wiki The '''storage''' available on the cluster comes in three flavours. Please use all cluster storage space responsibly: when you no longer need it, delete the contents or transfer them elsewhere so that others can use the space. Large datasets will be being processed on the cluster. (See also [[policies]].) == Overview == Most cluster users will use /home for small amounts of important data (no more than 50 GB) and /beegfs/general for large amounts of data. Your home directory is created automatically for you, but you will need to make a directory /beegfs/general/your_username to work on the /beegfs volume. For details of all the different areas available, read more below. == System-wide NFS storage == This is shared across all machines and served by single servers in each case. It can be slow if many users are trying to access it simultaneously. We are phasing out the use of this kind of storage for large volumes. Currently general-user NFS volumes are: * 3 Tb of user home directories, mounted as /home * Software directory /soft * 58 Tb of scratch for CAIR users only, mounted as /cair-scratch * 77 Tb of scratch for CAIR users only, mounted as /cair-work In addition there are some special NFS volumes, which are mounted under /data and are only available to particular users or groups: * /data/lofar : for LOFAR-UK users * /data/jim : Jim Geach * /data/astroml : Machine learning group The /home disc is for files that are important and require long-term storage (source code, qsub scripts, etc). It should not be used for general data or for large quantities of output from jobs, for example. Large files should be stored on the relevant data disk (if you want to work relative to /home, use a symbolic link). A [[quota]] system is in place on /home. == Distributed file system == The cluster uses the [https://www.beegfs.io/content/ Beegfs] file system for distributed storage, mounted at /beegfs . Files stored here are distributed over a number of servers in a way that's transparent to the user, who sees one big file system. Because there are multiple servers, /beegfs scales much better under load than the NFS storage as long as it is not full. 
There is 2.8 PB of beegfs storage nominally distributed as follows: * 486 TB: general use, under /beegfs/general * 1122 TB: CAR, under /beegfs/car * 500 TB: LOFAR-UK, under /beegfs/lofar * 671 TB: CACP, under /beegfs/cair Use of the relevant subdirectories means that you think you have permission to use those allocations. In practice there is no difference in where or how the data are stored. == Disks local to machines == The four [[SMP machines]] have their own local scratch discs: see the relevant page for more information. These are mounted system-wide as /smp1 .. /smp4 Nodes also have a small amount of [[local disk space]] accessible to jobs (but not mounted on the head node), and [[ramdisks]] are also available. == Backups == No data area on the cluster is currently backed up. You must take responsibility for your own backups. /home is backed up by nightly rsync to another location on the cluster: that means if you delete a crucial file we ''may'' have a useful backup (ask straight away). e932cd1b569724339dcd852863d3ba99dfa083d3 704 696 2022-06-26T08:44:28Z Mjh 2 wikitext text/x-wiki The '''storage''' available on the cluster comes in three flavours. Please use all cluster storage space responsibly: when you no longer need it, delete the contents or transfer them elsewhere so that others can use the space. Large datasets will be being processed on the cluster. (See also [[policies]].) == Overview == Most cluster users will use /home for small amounts of important data (no more than 50 GB) and /beegfs/general for large amounts of data. Your home directory is created automatically for you, but you will need to make a directory /beegfs/general/your_username (unless you are a member of a special group with dedicated storage, see below) to work on the /beegfs volume. For details of all the different areas available, read more below. == System-wide NFS storage == This is shared across all machines and served by single servers in each case. It can be slow if many users are trying to access it simultaneously. We are phasing out the use of this kind of storage for large volumes. Currently general-user NFS volumes are: * 6.5 Tb of user home directories, mounted as /home and /home2 * Software directory /soft * 58 Tb of scratch for CAIR users only, mounted as /cair-scratch * 77 Tb of scratch for CAIR users only, mounted as /cair-work In addition there are some special NFS volumes, which are mounted under /data and are only available to particular users or groups: * /data/lofar : for LOFAR-UK users * /data/jim : Jim Geach * /data/astroml : Machine learning group The /home disc is for files that are important and require long-term storage (source code, qsub scripts, etc). It should not be used for general data or for large quantities of output from jobs, for example. Large files should be stored on the relevant data disk (if you want to work relative to /home, use a symbolic link). A [[quota]] system is in place on /home. == Distributed file system == The cluster uses the [https://www.beegfs.io/content/ Beegfs] file system for distributed storage, mounted at /beegfs . Files stored here are distributed over a number of servers in a way that's transparent to the user, who sees one big file system. Because there are multiple servers, /beegfs scales much better under load than the NFS storage as long as it is not full. 
There is 3.8 PB of beegfs storage nominally distributed as follows: * 486 TB: general use, under /beegfs/general * 1122 TB: CAR, under /beegfs/car * 500 TB: LOFAR-UK, under /beegfs/lofar * 671 TB: CACP, under /beegfs/cair Use of the relevant subdirectories means that you think you have permission to use those allocations. In practice there is no difference in where or how the data are stored. == Disks local to machines == The four [[SMP machines]] have their own local scratch discs: see the relevant page for more information. These are mounted system-wide as /smp1 .. /smp4 Nodes also have a small amount of [[local disk space]] accessible to jobs (but not mounted on the head node), and [[ramdisks]] are also available. == Backups == No data area on the cluster is currently backed up. You must take responsibility for your own backups. /home and /home2 are backed up by nightly rsync to another location on the cluster: that means if you delete a crucial file from your home directory we ''may'' have a useful backup (ask straight away). bcaf61bdb03b3d716a43e58c7aa32d6a4647479b 705 704 2022-06-27T07:49:59Z Mjh 2 /* Overview */ wikitext text/x-wiki The '''storage''' available on the cluster comes in three flavours. Please use all cluster storage space responsibly: when you no longer need it, delete the contents or transfer them elsewhere so that others can use the space. Large datasets will be being processed on the cluster. (See also [[policies]].) == Overview == Most cluster users will use their home directories (on /home and /home2) for small amounts of important data (no more than 50 GB) and /beegfs/general for large amounts of data. Your home directory is created automatically for you, but you will need to make a directory /beegfs/general/your_username (unless you are a member of a special group with dedicated storage, see below) to work on the /beegfs volume. For details of all the different areas available, read more below. == System-wide NFS storage == This is shared across all machines and served by single servers in each case. It can be slow if many users are trying to access it simultaneously. We are phasing out the use of this kind of storage for large volumes. Currently general-user NFS volumes are: * 6.5 Tb of user home directories, mounted as /home and /home2 * Software directory /soft * 58 Tb of scratch for CAIR users only, mounted as /cair-scratch * 77 Tb of scratch for CAIR users only, mounted as /cair-work In addition there are some special NFS volumes, which are mounted under /data and are only available to particular users or groups: * /data/lofar : for LOFAR-UK users * /data/jim : Jim Geach * /data/astroml : Machine learning group The /home disc is for files that are important and require long-term storage (source code, qsub scripts, etc). It should not be used for general data or for large quantities of output from jobs, for example. Large files should be stored on the relevant data disk (if you want to work relative to /home, use a symbolic link). A [[quota]] system is in place on /home. == Distributed file system == The cluster uses the [https://www.beegfs.io/content/ Beegfs] file system for distributed storage, mounted at /beegfs . Files stored here are distributed over a number of servers in a way that's transparent to the user, who sees one big file system. Because there are multiple servers, /beegfs scales much better under load than the NFS storage as long as it is not full. 
There is 3.8 PB of beegfs storage nominally distributed as follows: * 486 TB: general use, under /beegfs/general * 1122 TB: CAR, under /beegfs/car * 500 TB: LOFAR-UK, under /beegfs/lofar * 671 TB: CACP, under /beegfs/cair Use of the relevant subdirectories means that you think you have permission to use those allocations. In practice there is no difference in where or how the data are stored. == Disks local to machines == The four [[SMP machines]] have their own local scratch discs: see the relevant page for more information. These are mounted system-wide as /smp1 .. /smp4 Nodes also have a small amount of [[local disk space]] accessible to jobs (but not mounted on the head node), and [[ramdisks]] are also available. == Backups == No data area on the cluster is currently backed up. You must take responsibility for your own backups. /home and /home2 are backed up by nightly rsync to another location on the cluster: that means if you delete a crucial file from your home directory we ''may'' have a useful backup (ask straight away). aa468efddc5f0691a5643e757d3b4cc3154b7a92 707 705 2022-07-21T19:16:47Z Mjh 2 /* Disks local to machines */ wikitext text/x-wiki The '''storage''' available on the cluster comes in three flavours. Please use all cluster storage space responsibly: when you no longer need it, delete the contents or transfer them elsewhere so that others can use the space. Large datasets will be being processed on the cluster. (See also [[policies]].) == Overview == Most cluster users will use their home directories (on /home and /home2) for small amounts of important data (no more than 50 GB) and /beegfs/general for large amounts of data. Your home directory is created automatically for you, but you will need to make a directory /beegfs/general/your_username (unless you are a member of a special group with dedicated storage, see below) to work on the /beegfs volume. For details of all the different areas available, read more below. == System-wide NFS storage == This is shared across all machines and served by single servers in each case. It can be slow if many users are trying to access it simultaneously. We are phasing out the use of this kind of storage for large volumes. Currently general-user NFS volumes are: * 6.5 Tb of user home directories, mounted as /home and /home2 * Software directory /soft * 58 Tb of scratch for CAIR users only, mounted as /cair-scratch * 77 Tb of scratch for CAIR users only, mounted as /cair-work In addition there are some special NFS volumes, which are mounted under /data and are only available to particular users or groups: * /data/lofar : for LOFAR-UK users * /data/jim : Jim Geach * /data/astroml : Machine learning group The /home disc is for files that are important and require long-term storage (source code, qsub scripts, etc). It should not be used for general data or for large quantities of output from jobs, for example. Large files should be stored on the relevant data disk (if you want to work relative to /home, use a symbolic link). A [[quota]] system is in place on /home. == Distributed file system == The cluster uses the [https://www.beegfs.io/content/ Beegfs] file system for distributed storage, mounted at /beegfs . Files stored here are distributed over a number of servers in a way that's transparent to the user, who sees one big file system. Because there are multiple servers, /beegfs scales much better under load than the NFS storage as long as it is not full. 
There is 3.8 PB of beegfs storage nominally distributed as follows: * 486 TB: general use, under /beegfs/general * 1122 TB: CAR, under /beegfs/car * 500 TB: LOFAR-UK, under /beegfs/lofar * 671 TB: CACP, under /beegfs/cair Using one of these subdirectories indicates that you believe you have permission to use that allocation. In practice there is no difference in where or how the data are stored. == Disks local to machines == Nodes have a small amount of [[local disk space]] accessible to jobs (but not mounted on the head node), and [[ramdisks]] are also available. == Backups == With the exception of the home directories, no data area on the cluster is currently backed up: you must take responsibility for your own backups. /home and /home2 are copied by a nightly rsync to another location on the cluster, which means that if you delete a crucial file from your home directory we ''may'' have a useful backup (ask straight away). 3b074c4e02fa5a6edb759b9acba4ab4a7a88f27a Reservations 0 46 677 326 2021-05-27T11:16:19Z Mjh 2 wikitext text/x-wiki It is possible to request reservations of one or more nodes at fixed times in the future. If you have reserved a node, no jobs other than yours will be able to run on it. You can request a reservation from the [[administrators]] or, if you have a particular need to do this often, we can set things up so that you can create your own reservations automatically. Please only ask for a reservation if you believe that there is no method of doing what you want within the standard [[Jobs|queueing system]]. Reservations will usually be for a group of people, but may be for an individual. If you need to use a group reservation, you will need to know the name of the group in question, and you will need to belong to that group. Typing <tt>groups</tt> at a shell prompt on the head node will tell you what groups you belong to. General guidelines for reservations are as follows: * If creating a reservation yourself, reserve the machine(s) for a period when no existing jobs will be running -- use <tt>qstat</tt> to determine when this will be. * If you are using a personal reservation, use it by submitting a job as normal: any reservation available to you will be used automatically. If you need an interactive session, use an [[interactive jobs|interactive job]]; e.g. if you have reserved smp2 for two days, do <tt>qsub -I -q smp -l nodes=smp2:ppn=48 -l walltime=48:00:00</tt>. * If you are using a group reservation, specify that you want to use it by adding the option <tt>-W group_list=[groupname]</tt> to the <tt>qsub</tt> command or script. E.g. to use 8 cores of the <tt>scuba2</tt> group reservation on smp1 interactively, do <tt>qsub -W group_list=scuba2 -q smp -l nodes=smp1:ppn=8 -I</tt>. Again, the reservation will be used if the resources are available, and your job will otherwise go into the general pool. * If you no longer need a reservation, e-mail the administrators to ask them to delete it. d20156d8eb08a8ed1c15f214cb0ccbd642cba767 Cluster bibliography 0 30 678 608 2021-06-02T10:54:53Z Asinha 12 Add Sinha et al. 2021 to list wikitext text/x-wiki Please add details of any papers written using the cluster here. This allows us to keep track of the productivity of the cluster. You can use any format you like so long as you identify the authors, the title (this helps us keep track of what different areas of research are being carried out with the cluster), and an indication of the bibliographic details and date.
* Ankur Sinha, Christoph Metzner, Neil Davey, Roderick Adams, Michael Schmuker, and Volker Steuber. Growth rules for the repair of asynchronous irregular neuronal networks after peripheral lesions. '''PLOS Computational Biology''', 17(6):1–35, '''2021'''. URL: https://doi.org/10.1371/journal.pcbi.1008996, doi:10.1371/journal.pcbi.1008996. * Patel, H., & Kukol, A. ('''2019'''). Prediction of ligands to universally conserved binding sites of the influenza A virus nuclear export protein. ''Virology'', 537, 97-103. * Taylor, P., Kobayashi, C., Federrath, C. The metallicity and elemental abundance maps of kinematically atypical galaxies for constraining minor merger and accretion histories. '''2019'''. MNRAS, 485, 3215 * Wang, E.X., Taylor, P., Federrath, C., Kobayashi, C. The impact of black hole seeding in cosmological simulations. '''2019'''. MNRAS, 483, 4640 * Taylor, P., Federrath, C., Kobayashi, C. The origin of kinematically distinct cores and misaligned gas discs in galaxies from cosmological simulations. '''2018'''. MNRAS, 479, 141 * Taylor, P., Kobayashi, C. The metallicity and elemental abundance gradients of simulated galaxies and their environmental dependence. '''2017'''. MNRAS, 471, 3856 * Taylor, P., Federrath, C., Kobayashi, C. Star formation in simulated galaxies: understanding the transition to quiescence at 3x10^10 solar masses. '''2017'''. MNRAS, 469, 4249 *Taylor, P., Kobayashi, C. Time evolution of galaxy scaling relations in cosmological simulations. '''2016'''. MNRAS, 463, 2465 *Taylor, P., Kobayashi, C. Quantifying AGN-driven metal-enhanced outflows in chemodynamical simulations. '''2015'''. MNRAS, 452L, 59 *Taylor, P., Kobayashi, C. The effects of AGN feedback on present-day galaxy properties in cosmological simulations. '''2015'''. MNRAS, 448, 1835 *Taylor, P., Kobayashi, C. Seeding black holes in cosmological simulations. '''2014'''. MNRAS, 442, 2751 * Stotz, H., Pascoe, H., Parham, H., Fitt, B., Mashanova, A., Kukol, A., ... & Hossein, B. ('''2018'''). Genomic evidence for genes encoding leucine-rich repeat receptors linked to resistance against the eukaryotic extra- and intracellular Brassica napus pathogens Leptosphaeria maculans and Plasmodiophora brassicae. ''PLoS ONE'', 13(06), 1-17. [e0198201]. DOI: 10.1371/journal.pone.0198201 * Patel, H., & Kukol, A. ('''2017'''). Evolutionary conservation of influenza A PB2 sequences reveals potential target sites for small molecule inhibitors. ''Virology'', 509, 112-120. DOI: 10.1016/j.virol.2017.06.009 * Patel, H., & Kukol, A. ('''2016'''). Evaluation of a novel virtual screening strategy using receptor decoy binding sites. ''J Negative Results in BioMedicine'', 15(15). * Kukol A, Hughes DJ ('''2014'''). Large-scale analysis of Influenza A virus nucleoprotein sequence conservation reveals potential drug-target sites, ''Virology'', 454/55: 40-47. * Poojari, C, Kukol, A, Strodel, B ('''2013'''). How the amyloid-β peptide and membranes affect each other: An extensive simulation study, ''Biochimica et Biophysica Acta - Biomembranes'', 1828(2), 327-339. * Steuernagel O., Kakofengitis D., Ritter G., Wigner flow reveals topological order in quantum phase space dynamics, '''2012''', accepted by ''PRL''. * Hardcastle MJ, Krause MGH, Numerical modelling of the lobes of radio galaxies in cluster environments, '''2012''', submitted to ''MNRAS'' * Kalia M, Kukol A ('''2011'''.) Structure and dynamics of the kinase IKK-β – a key regulator of the NF-kappa B transcription factor, ''J Struct Biol'', 176(2), 133-142. 
* Kukol A ('''2011'''). Consensus virtual screening approaches to predict protein ligands, ''Eur J Med Chem'', 46(9), 4661-4664. * Hardcastle MJ, Croston JH, Modelling TeV gamma-ray emission from the kiloparsec-scale jets of Centaurus A and M87, '''2011''', ''MNRAS'' 415 433 * Goodger JL, Croston JH, Hardcastle MJ, The influence of radio-loud AGN on groups of galaxies: Paper I - the X-ray luminosity-temperature relationship, submitted to ''MNRAS'', May 2011. * de Sousa G, Maex R, Adams R, Davey N, Steuber V, Optimization of neuronal morphologies for pattern recognition, published at ''BMC Neuroscience'' '''2010''' * Karen Safaryan, Reinoud Maex, Rod Adams, Neil Davey, and Volker Steuber The Effect of Non-Specific LTD on Pattern Recognition in Cerebellar Purkinje Cells, published at ''BMC Neuroscience'' '''11''', P92 69d126c666d194541631c628dabfdeb83c9d0fdc Administrators 0 6 679 660 2021-07-13T09:05:57Z Mayaahorton 17 /* Administrators */ wikitext text/x-wiki == Administrators == UH users wanting help or support with UHHPC should ask for it through the UH helpdesk (e-mail helpdesk@herts.ac.uk or log in to [https://helpdesk.herts.ac.uk/ the web interface]). Please make sure that your e-mail or help request includes 'UHHPC' in the subject line to make sure that it is routed through to the dedicated cluster support team. Failure to do so can result in long delays. For account requests, please provide your contact details and department or working group as specified on the [[accounts]] page. External users, such as external consortium users, should send e-mail to Martin Hardcastle (m.j.hardcastle@herts.ac.uk). 3d73dca4162988faceea34760541a971e318ce7c 699 679 2022-05-06T10:27:54Z Mayaahorton 17 /* Administrators */ wikitext text/x-wiki == Administrators == UH users wanting help or support with UHHPC should ask for it through the UH helpdesk (e-mail helpdesk@herts.ac.uk or log in to [https://helpdesk.herts.ac.uk/ the web interface]). Please make sure that your e-mail or help request includes 'UHHPC' in the subject line to make sure that it is routed through to the dedicated cluster support team. Failure to do so can result in long delays. For account requests, please provide your contact details and department or working group as specified on the [[accounts]] page. Sending technical support requests to individual staff members is unlikely to get a response. Helpdesk access is restricted to those who have current UH login details. External users, such as external consortium users, should ask their UH collaborators to submit helpdesk requests where possible. If this is not possible you can send e-mail directly to Martin Hardcastle (m.j.hardcastle@herts.ac.uk). 816ffd9be9eacf0738f883732a4993ca4fc7fed2 700 699 2022-05-06T10:28:13Z Mayaahorton 17 /* Administrators */ wikitext text/x-wiki == Administrators == UH users wanting help or support with UHHPC should ask for it through the UH helpdesk (e-mail helpdesk@herts.ac.uk or log in to [https://helpdesk.herts.ac.uk/ the web interface]). Please make sure that your e-mail or help request includes 'UHHPC' in the subject line to make sure that it is routed through to the dedicated cluster support team. Failure to do so can result in long delays. For account requests, please provide your contact details and department or working group as specified on the [[accounts]] page. Sending technical support requests to individual staff members is unlikely to get a response. Helpdesk access is unfortunately restricted to those who have current UH login details. 
External users, such as external consortium users, should ask their UH collaborators to submit helpdesk requests where possible. If this is not possible you can send e-mail directly to Martin Hardcastle (m.j.hardcastle@herts.ac.uk). 2db67aa2ab8651d5f01826c1ebeead48d848e37e Read this first 0 70 680 514 2021-07-15T16:02:35Z Mayaahorton 17 /* Introduction to cluster computing */ wikitext text/x-wiki = Introduction to cluster computing = If you are new to the concept of cluster computing, read this '''before doing anything else'''. Cluster computing is fundamentally different from personal computing. If you do not understand how it works, you may cause problems for other users, or break [[Policies|cluster rules]]. The cluster is composed of 'nodes' which are individual computers, joined together by a network. Nodes have different roles. Specifically there are a few 'login nodes' or 'head nodes' (what you see when you log in to the cluster), and many (about 150) [[Architecture|compute nodes]], which are where the actual computing work is done. When you first log in, you will end up in your home directory and will have the ability to create and store folders as well as upload, move and download data using standard Linux commands or a repository such as Github. However, it is very important that you do not run scripts or large data processing tasks in this initial environment because you may slow down the whole cluster. Rather, you write a script to run a '[[Jobs|job]]' on the compute nodes (or alternatively submit an '[[Interactive job|Interactive_jobs]]'). Your script tells the cluster both what you want to do, and what resources you need to do it. Your job is then run on one or more compute nodes that match your requirements. Any code or task that would take more than a minute or two to execute needs to be submitted to the job scheduler. The [[administrators]] can, and often have to, terminate jobs that are running on the head nodes. If your code is capable of doing so, this model allows you to access many nodes simultaneously, and so to get much greater resources than your desktop or laptop computer can provide. But you need to understand it to use it. New users should read '''at least''' the following Wiki pages: * [[Accounts]] -- to find out how to get an account * [[Access]] -- to find out how to get access to the cluster * [[Architecture]] -- to find out what nodes there are * [[Jobs]] -- to find out how to run jobs on appropriate compute nodes * [[Queues]] -- to understand which queue to use * [[Storage]] -- to understand how and where to store data on the cluster Please don't approach the [[administrators]] for help until you have read and understood these pages. a55a807b66be0db805d94b4178cf4c254c1714ad 681 680 2021-07-15T16:03:10Z Mayaahorton 17 /* Introduction to cluster computing */ wikitext text/x-wiki = Introduction to cluster computing = If you are new to the concept of cluster computing, read this '''before doing anything else'''. Cluster computing is fundamentally different from personal computing. If you do not understand how it works, you may cause problems for other users, or break [[Policies|cluster rules]]. The cluster is composed of 'nodes' which are individual computers, joined together by a network. Nodes have different roles. Specifically there are a few 'login nodes' or 'head nodes' (what you see when you log in to the cluster), and many (about 150) [[Architecture|compute nodes]], which are where the actual computing work is done. 
When you first log in, you will end up in your home directory and will have the ability to create and store folders as well as upload, move and download data using standard Linux commands or a repository such as Github. However, it is very important that you do not run scripts or large data processing tasks in this initial environment because you may slow down the whole cluster. Rather, you write a script to run a '[[Jobs|job]]' on the compute nodes (or alternatively submit an '[[Interactive _jobs|interactive job]]'). Your script tells the cluster both what you want to do, and what resources you need to do it. Your job is then run on one or more compute nodes that match your requirements. Any code or task that would take more than a minute or two to execute needs to be submitted to the job scheduler. The [[administrators]] can, and often have to, terminate jobs that are running on the head nodes. If your code is capable of doing so, this model allows you to access many nodes simultaneously, and so to get much greater resources than your desktop or laptop computer can provide. But you need to understand it to use it. New users should read '''at least''' the following Wiki pages: * [[Accounts]] -- to find out how to get an account * [[Access]] -- to find out how to get access to the cluster * [[Architecture]] -- to find out what nodes there are * [[Jobs]] -- to find out how to run jobs on appropriate compute nodes * [[Queues]] -- to understand which queue to use * [[Storage]] -- to understand how and where to store data on the cluster Please don't approach the [[administrators]] for help until you have read and understood these pages. 4c578d4ca116bc13bed17f51b663953fdb9de279 682 681 2021-07-15T16:05:17Z Mayaahorton 17 /* Introduction to cluster computing */ wikitext text/x-wiki = Introduction to cluster computing = If you are new to the concept of cluster computing, read this '''before doing anything else'''. Cluster computing is fundamentally different from personal computing. If you do not understand how it works, you may cause problems for other users, or break [[Policies|cluster rules]]. The cluster is composed of 'nodes' which are individual computers, joined together by a network. Nodes have different roles. Specifically there are a few 'login nodes' or 'head nodes' (what you see when you log in to the cluster), and many (about 150) [[Architecture|compute nodes]], which are where the actual computing work is done. When you first log in, you will end up in your home directory and will have the ability to create and store folders as well as upload, move and download data using standard Linux commands or connect to a repository such as Github. However, it is very important that you do not run scripts or large data processing tasks in this initial environment because you may slow down the whole cluster. Rather, you write a script to run a '[[Jobs|job]]' on the compute nodes (or alternatively submit an '[[Interactive _jobs|interactive job]]'). Your script tells the cluster both what you want to do, and what resources you need to do it. Your job is then run on one or more compute nodes that match your requirements. Any code or task that would take more than a minute or two to execute needs to be submitted to the job scheduler. The [[administrators]] can, and often have to, terminate jobs that are running on the head nodes. 
If your code is capable of running in parallel, this model allows you to access many nodes simultaneously, and so to get much greater resources than your desktop or laptop computer can provide. But you need to understand it to use it. New users should read '''at least''' the following Wiki pages: * [[Accounts]] -- to find out how to get an account * [[Access]] -- to find out how to get access to the cluster * [[Architecture]] -- to find out what nodes there are * [[Jobs]] -- to find out how to run jobs on appropriate compute nodes * [[Queues]] -- to understand which queue to use * [[Storage]] -- to understand how and where to store data on the cluster Please don't approach the [[administrators]] for help until you have read and understood these pages. fb28f2979fda4455df4358d260abbf44dcb8a3b9 683 682 2021-07-15T16:06:35Z Mayaahorton 17 /* Introduction to cluster computing */ wikitext text/x-wiki = Introduction to cluster computing = If you are new to the concept of cluster computing, read this '''before doing anything else'''. Cluster computing is fundamentally different from personal computing. If you do not understand how it works, you may cause problems for yourself or other users, or break [[Policies|cluster rules]]. The cluster is composed of 'nodes', which are individual computers joined together by a network. Nodes have different roles. Specifically, there are a few 'login nodes' or 'head nodes' (what you see when you log in to the cluster), and many (about 150) [[Architecture|compute nodes]], which are where the actual computing work is done. When you first log in, you will end up in your home directory and will have the ability to create and store folders as well as to upload, move and download data using standard Linux commands or by connecting to a repository such as GitHub. However, it is very important that you do not run scripts or large data processing tasks in this initial environment, because you may slow down the whole cluster. Rather, you write a script to run a '[[Jobs|job]]' on the compute nodes (or alternatively submit an '[[Interactive jobs|interactive job]]'). Your script tells the cluster both what you want to do and what resources you need to do it; a minimal example is sketched at the end of this page. Your job is then run on one or more compute nodes that match your requirements. Any code or task that would take more than a minute or two to execute needs to be submitted to the job scheduler. The [[administrators]] can, and often have to, terminate jobs that are running on the head nodes. If your code is capable of running in parallel, this model allows you to access many nodes simultaneously, and so to get much greater resources than your desktop or laptop computer can provide. But you need to understand it to use it. New users should read '''at least''' the following Wiki pages: * [[Accounts]] -- to find out how to get an account * [[Access]] -- to find out how to get access to the cluster * [[Architecture]] -- to find out what nodes there are * [[Jobs]] -- to find out how to run jobs on appropriate compute nodes * [[Queues]] -- to understand which queue to use * [[Storage]] -- to understand how and where to store data on the cluster Please don't approach the [[administrators]] for help until you have read and understood these pages.
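As a purely illustrative sketch of the job model described above (not a supported template; the script and program names are placeholders, and [[Jobs]] has the definitive instructions), a minimal batch script for the cluster's <tt>qsub</tt>-based scheduler might look like:
<pre>
#!/bin/bash
# minimal example job: one core on one node for up to one hour in the default queue
#PBS -q main
#PBS -l walltime=01:00:00
#PBS -l nodes=1:ppn=1

# run from the directory the job was submitted from
cd $PBS_O_WORKDIR

# replace with whatever you actually want to run
./my_program
</pre>
You would submit this with <tt>qsub myscript.sh</tt>; the scheduler runs it on a suitable compute node and its standard output and error normally end up in files in the directory you submitted from.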
088b4057878d092657895e4cd800a2961d02231f Python packages 0 49 684 658 2021-08-10T20:33:26Z Mjh 2 /* Python virtual environments */ wikitext text/x-wiki Local python packages installed in <tt>/soft</tt> include * numpy * scipy * astropy * tensorflow * h5py * mpi4py Set your PYTHONPATH to <tt>/soft/python/lib64/python2.7/site-packages</tt> to make these and many more available. You can install your own local copies of packages that don't exist on the system using <tt>pip install --user PACKAGENAME</tt>. These will appear in your <tt>.local</tt> directory. However, please don't install local copies of software that is already globally available! If you need an upgraded version, talk to the system [[administrators]]. == Python 3.6 == A separate Python 3.6 installation is available. To use this, just run python36 or python3.6 . If you want to use system-wide installations of packages you will need to set your PYTHONPATH to include <tt>/soft/python3/usr/local/lib64/python3.6/site-packages</tt>, and make sure no Python 2 packages are on your PYTHONPATH. If you want Ipython3 add <tt>/soft/python3/usr/local/bin</tt> to your PATH. <tt>module load python3</tt> will make these changes for you. <tt>pip3</tt> can be used to install local copies of python3 packages. == Python virtual environments == Sometimes it is necessary to run Python in a clean environment -- particularly if you want Python3 to be the default Python. To set this up do the following (Python3 and bash assumed -- for tcsh use unsetenv not unset): <pre> unset PYTHONPATH python3 -m venv py3-venv source py3-venv/bin/activate pip3 install --upgrade pip </pre> This creates a directory py3-venv which will be used for your own private installation of packages. In future whenever you do <pre> source py3-venv/bin/activate </pre> your python version will be python3 and you will have the ability to install python3 packages to your py3-venv directory using pip (without <tt>--user</tt> option). For example, to get set up with Tensorflow with GPU support: <pre> source py3-venv/bin/activate pip install tensorflow module load cuda-10.1 ipython3 </pre> If all goes well then you will be able to do <pre> import tensorflow as tf tf.print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU'))) </pre> Note that using this method means you will also have to install any other modules you need, such as matplotlib, using pip install. f7ca739684dc82a4f558aee4433816a0c8d24235 685 684 2021-09-22T13:08:38Z Mjh 2 wikitext text/x-wiki Local python packages installed in <tt>/soft</tt> include * numpy * scipy * astropy * tensorflow * h5py * mpi4py Set your PYTHONPATH to <tt>/soft/python/lib64/python2.7/site-packages</tt> to make these and many more available. You can install your own local copies of packages that don't exist on the system using <tt>pip install --user PACKAGENAME</tt>. These will appear in your <tt>.local</tt> directory. However, please don't install local copies of software that is already globally available! If you need an upgraded version, talk to the system [[administrators]]. == Python 3.6 == A separate Python 3.6 installation is available. To use this, just run python36 or python3.6 . If you want to use system-wide installations of packages you will need to set your PYTHONPATH to include <tt>/soft/python3/usr/local/lib64/python3.6/site-packages</tt>, and make sure no Python 2 packages are on your PYTHONPATH. If you want Ipython3 add <tt>/soft/python3/usr/local/bin</tt> to your PATH. 
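Spelled out explicitly, and assuming a bash shell (tcsh users would use <tt>setenv</tt> instead), that amounts to something like:
<pre>
export PYTHONPATH=/soft/python3/usr/local/lib64/python3.6/site-packages
export PATH=/soft/python3/usr/local/bin:$PATH
</pre>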
<tt>module load python3</tt> will make these changes for you. <tt>pip3</tt> can be used to install local copies of python3 packages. == Python virtual environments == Sometimes it is necessary to run Python in a clean environment -- particularly if you want Python3 to be the default Python. To set this up do the following (Python3 and bash assumed -- for tcsh use unsetenv not unset): <pre> unset PYTHONPATH python3 -m venv py3-venv source py3-venv/bin/activate pip3 install --upgrade pip </pre> This creates a directory py3-venv which will be used for your own private installation of packages. In future whenever you do <pre> source py3-venv/bin/activate </pre> your python version will be python3 and you will have the ability to install python3 packages to your py3-venv directory using pip (without <tt>--user</tt> option). For example, to get set up with Tensorflow with GPU support: <pre> source py3-venv/bin/activate pip install tensorflow module load cuda-11.4 ipython3 </pre> If all goes well then you will be able to do <pre> import tensorflow as tf tf.print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU'))) </pre> Note that using this method means you will also have to install any other modules you need, such as matplotlib, using pip install. 254b2f611df24d9b03490e8bb51a830e2eb2f419 Interactive jobs 0 35 687 499 2021-10-22T17:02:31Z Mjh 2 /* Multiple CPUs */ wikitext text/x-wiki Interactive jobs are jobs which give you an interactive session on one of the compute nodes. Importantly, accessing the compute nodes this way means that the job control system guarantees the resources that you have asked for. If you simply log into a compute node with ssh (which is in any case forbidden by the cluster access [[policies]]), another user's job may compete with what you are trying to do. Therefore you should, unless explicitly authorized otherwise, always use the interactive job facility to run interactively on the compute nodes. == Running an interactive job == An interactive job is run using the <tt>qsub</tt> command as for normal [[jobs]]. However, the result of running an interactive job is that you are logged on to the target machine. For example, <pre> [user@headnode1 ~]$ qsub -l walltime=00:30:00 -I -q main qsub: waiting for job 123456.stri-cluster.herts.ac.uk to start qsub: job 123456.stri-cluster.herts.ac.uk ready [user@node047 ~]$ </pre> In this example the user has requested a 30-minute session on any node in the main cluster (the default request is one processor on one node). A node is available to run the job and so the user finds herself at a shell prompt. She may now use the node interactively for 30 minutes. At the end of the 30 minutes (wall time, not CPU time!) the job will end and the session will be terminated. If the user logs out before then, the job will terminate early. Note that, just as with ordinary jobs, you must honour the allocation you are given. If you ask for one CPU on one node, you must only use that. In the interactive shell, a number of environment variables beginning with PBS (try <tt>printenv | grep PBS</tt>) are provided to tell you the job environment in which you are running in case you have forgotten. If your request for an interactive shell cannot be fulfilled (because there are insufficient nodes available in the queue that you have specified), the qsub command will wait until it can be. == Advanced topics == === Multiple CPUs === If you know you want a machine completely dedicated to you (e.g. 
because you plan to run multithreaded code interactively) then you must explicitly request that: e.g., <pre> qsub -l walltime=24:00:00 -l nodes=1:ppn=32 -I -q smp </pre> will reserve all 32 cores of one of the [[SMP machines]] for you for a day. === Multiple nodes === In the unlikely situation in which you want to run interactively on several nodes at once, you will only be given one shell. You can use the <tt>pbsdsh</tt> command to run the same commands on each of the processors you have been allocated, or /usr/local/bin/mpiexec to run [[MPI]] jobs. <pre> qsub -l walltime=00:01:00 -l nodes=2:ppn=2 -I -q smp qsub: waiting for job 123456.stri-cluster.herts.ac.uk to start qsub: job 123456.stri-cluster.herts.ac.uk ready [user@smp2 ~]$ pbsdsh hostname smp2 smp1 smp1 smp2 </pre> === Specific machines === It is possible to request a specific machine just as for normal non-interactive [[jobs]]: <pre> qsub -l walltime=00:01:00 -l nodes=smp2:ppn=48 -I -q smp </pre> Do this if you know that you need to use some particular feature of the machine, such as the [[SMP machines]]' scratch discs. === X forwarding === If you want to run jobs that use the X windows system, you may turn on X11 forwarding using the <tt>-X</tt> option. (This is like the <tt>-X</tt> option to <tt>ssh</tt>.) === Walltime requests === Please be considerate in your walltime requests. While you have an interactive job running, no other user can use the resources you have allocated. Therefore, you should not set an interactive job running for a week and walk away. Jobs that appear to be reserving resources without using them will be terminated. Obviously the best strategy is to select a walltime that roughly matches what you want and exit before the time is up. 7997715999e267fbb32d964f429a08c55b9fa434 Software 0 17 690 642 2021-12-03T16:25:03Z Mayaahorton 17 /* Astronomical software */ wikitext text/x-wiki This page documents the software installed on the cluster and its location. Detailed local documentation of software, if available, goes in a page specific to that software linked from this list. Users are encouraged to update the wiki with descriptions of the software they use. Contact the [[administrators]] if you need an upgraded version of any of these. If you need any freely available software package that runs on Linux, in general we can install it. We do not have resources to pay for licences for commercial software, though, so if you need a commercial software package to be installed you will also need to provide funding for a licence. 
= Programming languages and development environments = * GNU C, C++ and Fortran: available by default, <tt>module load gcc-6.4</tt> for newer versions * Intel C and Fortran: <tt>module load intel</tt> * Python 2 and 3 and many [[Python packages]] * [[Matlab]]: in <tt>/soft/MATLAB/R2018b/bin/matlab</tt> * [[IDL]]: in <tt>/soft/idl/idl/bin</tt> * [[R]]: installed by default or <tt>/soft/R</tt> * [[Julia]]: <tt>module load julia</tt> * [[GPUs|CUDA]]: <tt>module load cuda...</tt> = Astronomical software = * Astropy: see [[Python packages]] * [[AIPS]]: 31DEC18 installed in <tt>/soft/aips</tt> * [[CASA]]: installed in <tt>/soft/casa...</tt> * [[Starlink]]: in <tt>/soft/star</tt> or <tt>/soft/stardev</tt> * [[LOFAR]]: in <tt>/soft/lofar-xxxxxx</tt>, see page for details * [[Miriad]]: in <tt> /soft/miriad</tt> * [[ciao]]: in <tt>/soft/ciao-x.x</tt> * [[SAS]]: in <tt>/soft/xmmsas_xxxxxx</tt> * [[PLUTO]]: see page for documentation * [[aoflagger]]: in <tt>/soft/aoflagger</tt> * [[wsclean]]: in <tt>/soft/wsclean</tt> * [[Brats]]: in <tt>/soft/brats</tt> * [[Topcat]] and Stilts: in <tt>/soft/topcat</tt> * [[DS9]]: in /soft/bin/ds9 = Engineering = * [[Ansys Fluent]]: in <tt>/soft/ansys_inc</tt> * [[Star-CCM+]]: in <tt>/soft/STAR-CCM+14.02.010</tt> * [[Cantera]]: in <tt>/soft/cantera</tt> * [[Converge]]: in <tt>/soft/CONVERGE_Studio/</tt> = Molecular dynamics = * [[NAMD]]: in <tt>/soft/NAMD_2.13_Linux-x86_64-ibverbs-smp-CUDA</tt> * [[Gromacs]]: in <tt>/soft/gromacs-2018-gpu</tt> * Autodock [[Vina]]: in <tt>/soft/autodock_vina_1_1_1_linux_x86</tt> * [[Autodock]] : in <tt>/soft/autodock</tt> * [[iGemDock]]: in <tt>/soft/iGEMDOCKv2.1-centos/</tt> = Computational neuroscience = * [[neuron]]: in <tt>/soft/nrn</tt> = Optimization = * [[Gurobi]]: in <tt>/soft/gurobi</tt> = Containerization = * [[Singularity]] in <tt>/soft/bin/singularity</tt> e4dd59107acbaa374090ed32a496af520ba9ad9c 715 690 2023-04-29T09:49:32Z Mjh 2 /* Programming languages and development environments */ wikitext text/x-wiki This page documents the software installed on the cluster and its location. Detailed local documentation of software, if available, goes in a page specific to that software linked from this list. Users are encouraged to update the wiki with descriptions of the software they use. Contact the [[administrators]] if you need an upgraded version of any of these. If you need any freely available software package that runs on Linux, in general we can install it. We do not have resources to pay for licences for commercial software, though, so if you need a commercial software package to be installed you will also need to provide funding for a licence. 
= Programming languages and development environments = * GNU C, C++ and Fortran: available by default, <tt>module load gcc-6.4</tt>, <tt>gcc 10.4</tt> or <tt>gcc 13.1</tt> for newer versions * Intel C and Fortran: <tt>module load intel</tt> * Python 2 and 3 and many [[Python packages]] * [[Matlab]]: in <tt>/soft/MATLAB/R2022b/bin/matlab</tt> * [[IDL]]: in <tt>/soft/idl/idl/bin</tt> * [[R]]: installed by default or <tt>/soft/R</tt> * [[Julia]]: <tt>module load julia</tt> * [[GPUs|CUDA]]: <tt>module load cuda...</tt> = Astronomical software = * Astropy: see [[Python packages]] * [[AIPS]]: 31DEC18 installed in <tt>/soft/aips</tt> * [[CASA]]: installed in <tt>/soft/casa...</tt> * [[Starlink]]: in <tt>/soft/star</tt> or <tt>/soft/stardev</tt> * [[LOFAR]]: in <tt>/soft/lofar-xxxxxx</tt>, see page for details * [[Miriad]]: in <tt> /soft/miriad</tt> * [[ciao]]: in <tt>/soft/ciao-x.x</tt> * [[SAS]]: in <tt>/soft/xmmsas_xxxxxx</tt> * [[PLUTO]]: see page for documentation * [[aoflagger]]: in <tt>/soft/aoflagger</tt> * [[wsclean]]: in <tt>/soft/wsclean</tt> * [[Brats]]: in <tt>/soft/brats</tt> * [[Topcat]] and Stilts: in <tt>/soft/topcat</tt> * [[DS9]]: in /soft/bin/ds9 = Engineering = * [[Ansys Fluent]]: in <tt>/soft/ansys_inc</tt> * [[Star-CCM+]]: in <tt>/soft/STAR-CCM+14.02.010</tt> * [[Cantera]]: in <tt>/soft/cantera</tt> * [[Converge]]: in <tt>/soft/CONVERGE_Studio/</tt> = Molecular dynamics = * [[NAMD]]: in <tt>/soft/NAMD_2.13_Linux-x86_64-ibverbs-smp-CUDA</tt> * [[Gromacs]]: in <tt>/soft/gromacs-2018-gpu</tt> * Autodock [[Vina]]: in <tt>/soft/autodock_vina_1_1_1_linux_x86</tt> * [[Autodock]] : in <tt>/soft/autodock</tt> * [[iGemDock]]: in <tt>/soft/iGEMDOCKv2.1-centos/</tt> = Computational neuroscience = * [[neuron]]: in <tt>/soft/nrn</tt> = Optimization = * [[Gurobi]]: in <tt>/soft/gurobi</tt> = Containerization = * [[Singularity]] in <tt>/soft/bin/singularity</tt> b992d675fce320a7e33ed297cecb1b0ad65db259 716 715 2023-04-29T09:50:08Z Mjh 2 /* Containerization */ wikitext text/x-wiki This page documents the software installed on the cluster and its location. Detailed local documentation of software, if available, goes in a page specific to that software linked from this list. Users are encouraged to update the wiki with descriptions of the software they use. Contact the [[administrators]] if you need an upgraded version of any of these. If you need any freely available software package that runs on Linux, in general we can install it. We do not have resources to pay for licences for commercial software, though, so if you need a commercial software package to be installed you will also need to provide funding for a licence. 
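Many of the packages listed below are made available through the [[Modules]] system rather than being on the default PATH. As a rough sketch of a typical session (module names change over time, so check what <tt>module avail</tt> reports on the head node):
<pre>
module avail           # list the modules currently installed
module load gcc-6.4    # load one of them, e.g. a newer GNU compiler
module list            # confirm what is loaded in this shell
</pre>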
= Programming languages and development environments = * GNU C, C++ and Fortran: available by default, <tt>module load gcc-6.4</tt>, <tt>gcc 10.4</tt> or <tt>gcc 13.1</tt> for newer versions * Intel C and Fortran: <tt>module load intel</tt> * Python 2 and 3 and many [[Python packages]] * [[Matlab]]: in <tt>/soft/MATLAB/R2022b/bin/matlab</tt> * [[IDL]]: in <tt>/soft/idl/idl/bin</tt> * [[R]]: installed by default or <tt>/soft/R</tt> * [[Julia]]: <tt>module load julia</tt> * [[GPUs|CUDA]]: <tt>module load cuda...</tt> = Astronomical software = * Astropy: see [[Python packages]] * [[AIPS]]: 31DEC18 installed in <tt>/soft/aips</tt> * [[CASA]]: installed in <tt>/soft/casa...</tt> * [[Starlink]]: in <tt>/soft/star</tt> or <tt>/soft/stardev</tt> * [[LOFAR]]: in <tt>/soft/lofar-xxxxxx</tt>, see page for details * [[Miriad]]: in <tt> /soft/miriad</tt> * [[ciao]]: in <tt>/soft/ciao-x.x</tt> * [[SAS]]: in <tt>/soft/xmmsas_xxxxxx</tt> * [[PLUTO]]: see page for documentation * [[aoflagger]]: in <tt>/soft/aoflagger</tt> * [[wsclean]]: in <tt>/soft/wsclean</tt> * [[Brats]]: in <tt>/soft/brats</tt> * [[Topcat]] and Stilts: in <tt>/soft/topcat</tt> * [[DS9]]: in /soft/bin/ds9 = Engineering = * [[Ansys Fluent]]: in <tt>/soft/ansys_inc</tt> * [[Star-CCM+]]: in <tt>/soft/STAR-CCM+14.02.010</tt> * [[Cantera]]: in <tt>/soft/cantera</tt> * [[Converge]]: in <tt>/soft/CONVERGE_Studio/</tt> = Molecular dynamics = * [[NAMD]]: in <tt>/soft/NAMD_2.13_Linux-x86_64-ibverbs-smp-CUDA</tt> * [[Gromacs]]: in <tt>/soft/gromacs-2018-gpu</tt> * Autodock [[Vina]]: in <tt>/soft/autodock_vina_1_1_1_linux_x86</tt> * [[Autodock]] : in <tt>/soft/autodock</tt> * [[iGemDock]]: in <tt>/soft/iGEMDOCKv2.1-centos/</tt> = Computational neuroscience = * [[neuron]]: in <tt>/soft/nrn</tt> = Optimization = * [[Gurobi]]: in <tt>/soft/gurobi</tt> = Containerization = * [[Singularity]] in <tt>/soft/bin/singularity</tt> or <tt>module load singularity...</tt> be8620907384a48a792b44aba4ef2a60eb015b87 Main Page 0 1 691 552 2021-12-03T16:28:12Z Mayaahorton 17 /* How-Tos */ wikitext text/x-wiki == Welcome to the cluster documentation wiki == This wiki is the location for documentation for the UH HPC service. If you are a cluster user, feel free to register for an account so that you can describe any method you use to achieve a particular result on the cluster. You must be approved for an account before editing the Wiki. Some local documentation on [[MediaWiki]] is available. 
== Getting started == * [[Read this first]] == Cluster basics == * [[Accounts]] and [[Account cancellation policy]] * [[Terms of use]] * [[Policies]] * [[Fair share]] * [[Access]] * [[Architecture]] * [[Networking]] * [[Storage]]; [[quota]] system * [[Administrators]]' contact details == Using the cluster == * [[Jobs]] * [[Queues]] * [[Reservations]] * [[SMP machines]] * [[GPUs]] * [[Modules]] * [[MPI]] * [[OpenMP]] * [[Parallelization|How to parallelize your job]] * [[Compilers]] * [[Software]] * [[Mail]] * [[Web server]] * [[Monitoring]] * [[LOFAR-UK Compute Facility]] == Troubleshooting == * [[Why doesn't my job run?]] * [[Job errors]] == Publications == * [[Acknowledgements]] * [[Cluster bibliography]] == How-Tos == * [[Star-CCM+|How to use Star-CCM+ on the cluster]] * [[Galaxy|How to use Galaxy on the cluster]] == Known problems == * [[Known problems]] 20be339cace77f930524a3d204d232d1c44b908b 701 691 2022-05-06T10:29:00Z Mayaahorton 17 /* Troubleshooting */ wikitext text/x-wiki == Welcome to the cluster documentation wiki == This wiki is the location for documentation for the UH HPC service. If you are a cluster user, feel free to register for an account so that you can describe any method you use to achieve a particular result on the cluster. You must be approved for an account before editing the Wiki. Some local documentation on [[MediaWiki]] is available. == Getting started == * [[Read this first]] == Cluster basics == * [[Accounts]] and [[Account cancellation policy]] * [[Terms of use]] * [[Policies]] * [[Fair share]] * [[Access]] * [[Architecture]] * [[Networking]] * [[Storage]]; [[quota]] system * [[Administrators]]' contact details == Using the cluster == * [[Jobs]] * [[Queues]] * [[Reservations]] * [[SMP machines]] * [[GPUs]] * [[Modules]] * [[MPI]] * [[OpenMP]] * [[Parallelization|How to parallelize your job]] * [[Compilers]] * [[Software]] * [[Mail]] * [[Web server]] * [[Monitoring]] * [[LOFAR-UK Compute Facility]] == Troubleshooting == * [[Start here]] * [[Why doesn't my job run?]] * [[Job errors]] == Publications == * [[Acknowledgements]] * [[Cluster bibliography]] == How-Tos == * [[Star-CCM+|How to use Star-CCM+ on the cluster]] * [[Galaxy|How to use Galaxy on the cluster]] == Known problems == * [[Known problems]] ce9d54736e4f6434c66a42f82a281977d0f49c4f 725 701 2023-09-26T10:32:52Z Mjh 2 wikitext text/x-wiki == Welcome to the UHHPC documentation wiki == This wiki is the location for documentation for the UH HPC service. If you are a cluster user, feel free to register for an account so that you can describe any method you use to achieve a particular result on the cluster. You must be approved for an account before editing the Wiki. Some local documentation on [[MediaWiki]] is available. 
== Getting started == * [[Read this first]] == Cluster basics == * [[Accounts]] and [[Account cancellation policy]] * [[Terms of use]] * [[Policies]] * [[Fair share]] * [[Access]] * [[Architecture]] * [[Networking]] * [[Storage]]; [[quota]] system * [[Administrators]]' contact details == Using the cluster == * [[Jobs]] * [[Queues]] * [[Reservations]] * [[SMP machines]] * [[GPUs]] * [[Modules]] * [[MPI]] * [[OpenMP]] * [[Parallelization|How to parallelize your job]] * [[Compilers]] * [[Software]] * [[Mail]] * [[Web server]] * [[Monitoring]] * [[LOFAR-UK Compute Facility]] == Troubleshooting == * [[Start here]] * [[Why doesn't my job run?]] * [[Job errors]] == Publications == * [[Acknowledgements]] * [[Cluster bibliography]] == How-Tos == * [[Star-CCM+|How to use Star-CCM+ on the cluster]] * [[Galaxy|How to use Galaxy on the cluster]] == Known problems == * [[Known problems]] e0de67892d4e7c22fcd6841c09676ae45012d084 Galaxy 0 86 692 2021-12-03T16:28:21Z Mayaahorton 17 Created page with "Coming soon." wikitext text/x-wiki Coming soon. 4b112f37651048ba2ea49ff06d2785674491b2b3 PLUTO 0 87 693 2021-12-03T17:06:41Z Mayaahorton 17 Created page with "PLUTO is used for astrophysical fluid dynamics and other applications. It is widely used on the cluster but is sensitive to MPI issues and has different versions. You will nee..." wikitext text/x-wiki PLUTO is used for astrophysical fluid dynamics and other applications. It is widely used on the cluster but is sensitive to MPI issues and has different versions. You will need to download and install the version that best matches your needs. It can be freely downloaded [[http://plutocode.ph.unito.it|from here]] and includes extensive documentation. You may belong to a research group which uses a modified version. Whilst the software can take time to master, you are advised to read the documentation and try out some of the test problems. This is particularly true if you are also new to cluster computing -- many test problems can be run quickly on a laptop, allowing you to become familiar with the setup. Of course, large problems will eventually require the use of the cluster. Many, but not all, PLUTO problems will require three files to run: init.c, definitions.h and pluto.ini. These are typically used to set up grid settings, define variables and store the actual models required for your problem. Once these are set up you will need to generate and run a makefile as outlined in the documentation. 13f430974bfec4b720e6d827bee901b5d411b901 Queues 0 15 694 581 2022-02-14T14:49:34Z Mayaahorton 17 wikitext text/x-wiki There are eight possible job queues available for general use on the system: * 'main' is the default queue: this submits to the main cluster. The maximum wall time on this queue is 1 week. * 'gpu' submits to the GPU nodes. The maximum wall time on this queue is 48 hours. * 'large' submits to 64 core nodes. The maximum wall time on this queue is 1 week, but you may require permission if your job requires a high number of nodes. * 'test' submits to 96 core nodes. Currently, access is limited to those requiring a high number of CPUs; if you don't already have access please contact a member of the team to discuss your needs. The maximum wall time is 1 week. * 'smp' submits to the [[SMP machines]]. It is not currently possible to run MPI jobs that span the SMP machines and the main or CAIR clusters. The maximum wall time for this queue is 48 hours. * 'cair_l' submits to the dedicated CAIR nodes. This queue is restricted to CAIR users. 
* 'car' submits to the dedicated CAR nodes. This queue is restricted to CAR users. The maximum wall time for this queue is 1 week. * 'forecast' submits to the dedicated air quality forecast nodes. == Default wall times == The default wall time for all the queues is 24 hours. If you want a job to run for longer, you must explicitly provide a wall time estimate. == Other defaults == The default memory use per process for the main queue is 1 GB. If you need more memory than this, see the section on [[memory]]. The default number of nodes for a job on all queues is 1. b1634b00c6ef6f788940a484453fac979aed182f 710 694 2022-08-15T10:27:10Z Mjh 2 wikitext text/x-wiki There are five possible job queues available for general use on the system: * 'main' is the default queue: this submits to the main cluster. The maximum wall time on this queue is 1 week. * 'gpu' submits to the GPU nodes. The maximum wall time on this queue is 48 hours. * 'large' submits to 64 core nodes. The maximum wall time on this queue is 1 week, but you may require permission if your job requires a high number of nodes. * 'test' submits to 96 core nodes. Currently, access is limited to those requiring a high number of CPUs; if you don't already have access please contact a member of the team to discuss your needs. The maximum wall time is 1 week. * 'forecast' submits to the dedicated air quality forecast nodes. == Default wall times == The default wall time for all the queues is 24 hours. If you want a job to run for longer, you must explicitly provide a wall time estimate. == Other defaults == The default memory use per process for the main queue is 1 GB. If you need more memory than this, see the section on [[memory]]. The default number of nodes for a job on all queues is 1. a5e3f479f4d21608d469b8005b88cadf37e0e02f Web server 0 32 697 289 2022-03-22T11:16:22Z Mjh 2 wikitext text/x-wiki The web server <tt>http://uhhpc.herts.ac.uk/</tt> is visible inside and outside the university. If you create a directory <tt>public_html</tt> in your home directory, then its contents are visible at <tt>http://uhhpc.herts.ac.uk/~your-username/</tt>. You can use this to export data; for large datasets, use symbolic links to /beegfs. Do not rely on the long-term existence of this facility (e.g. you should not use the cluster to host your personal home page). 935bc154a2265b637acf893fe8ac9312d1de4bfc Matlab 0 88 698 2022-05-02T09:27:42Z Mayaahorton 17 Created page with "At the present time, the current working version of MATLAB on the cluster can be accessed using /soft/MATLAB/2018b/bin/matlab. MATLAB on the cluster works best when scripted b..." wikitext text/x-wiki The current working version of MATLAB on the cluster can be accessed using /soft/MATLAB/2018b/bin/matlab. MATLAB on the cluster works best when scripted, but if you really need a GUI you can use X forwarding (<tt>ssh -X</tt>). For everyday interactive computation there is little benefit in using the cluster over desktop PCs on campus; the exception is heavy computation and simulations, which do benefit from the cluster but will be slowed down considerably by X forwarding, so such work is best run as scripted batch jobs. If you need a different version for a specific reason please contact us. ce043684ef176fe961fd4988d34f4464821c2716 Start here 0 89 702 2022-05-06T11:10:01Z Mayaahorton 17 Created page with "==Overview== High Performance Computing environments are often built on heterogeneous architecture, meaning that no two clusters are alike. Code that runs on one HPC system i..."
wikitext text/x-wiki ==Overview== High Performance Computing environments are often built on heterogeneous architecture, meaning that no two clusters are alike. Code that runs on one HPC system is not guaranteed to work on another. Even within the UHHPC, some machines are designed for certain tasks and have software and filesystems mounted that are not available from elsewhere in the cluster. When code doesn't run, it is tempting to think that the cluster is broken. More often, though, there is a coding problem rather than a hardware problem. The UHHPC admin team manages more than 400 users across approximately 20 different research groups and departments. We work with external partners at dozens of institutions worldwide. The cluster runs thousands of specialised research software packages and modules. The UHHPC is a shared research facility and users are responsible for their own research and experimental design. Our primary goal is the maintenance and development of the physical infrastructure. We are a very small team and as such cannot help optimise experiments or debug code. Except in rare circumstances, we cannot compile or recompile code for you (if you really need this, please talk to us first). Most errors are not caused by hardware malfunction (although this does happen occasionally, particularly after power outages). It can be very difficult to know whether a problem is caused by a cluster issue which needs to be reported to the helpdesk, or a code issue which you could solve yourself. The following sections are designed to give a quick look at the four main classes of scripting error and some steps you can take to try to resolve them yourself. (Sections coming soon) ==Job scheduling problems== UHHPC uses a PBS-based batch system: the Torque resource manager with the Maui scheduler. You can submit jobs interactively or through a submission script. A list of common scheduler errors is given below (coming soon), including resource request errors. ==Local environment problems== Including difficulties with paths, dependencies, shell setup and local installations (coming soon) ==Software problems== Coming soon ==MPI problems== An overview of some of the most challenging problems to identify and resolve (coming soon) f9d51691cb29b41bdcf6851a02a0946f54bd2dff 703 702 2022-05-06T11:39:31Z Mayaahorton 17 /* Software problems */ wikitext text/x-wiki ==Overview== High Performance Computing environments are often built on heterogeneous architecture, meaning that no two clusters are alike. Code that runs on one HPC system is not guaranteed to work on another. Even within the UHHPC, some machines are designed for certain tasks and have software and filesystems mounted that are not available from elsewhere in the cluster. When code doesn't run, it is tempting to think that the cluster is broken. More often, though, there is a coding problem rather than a hardware problem. The UHHPC admin team manages more than 400 users across approximately 20 different research groups and departments. We work with external partners at dozens of institutions worldwide. The cluster runs thousands of specialised research software packages and modules. The UHHPC is a shared research facility and users are responsible for their own research and experimental design. Our primary goal is the maintenance and development of the physical infrastructure. We are a very small team and as such cannot help optimise experiments or debug code.
Except in rare circumstances, we cannot compile or recompile code for you (if you really need this, please talk to us first). Most errors are not caused by hardware malfunction (although this does happen occasionally, particularly after power outages). It can be very difficult to know whether a problem is caused by a cluster issue which needs to be reported to the helpdesk, or a code issue which you could solve yourself. The following sections are designed to give a quick look at the four main classes of scripting error and some steps you can take to try to resolve them yourself. (Sections coming soon) ==Job scheduling problems== UHHPC uses a PBS-based batch system: the Torque resource manager with the Maui scheduler. You can submit jobs interactively or through a submission script. A list of common scheduler errors is given below (coming soon), including resource request errors. ==Local environment problems== Including difficulties with paths, dependencies, shell setup and local installations (coming soon) ==Software problems== Including issues with compiling. Will also cover what to check when your code is running much more slowly than expected. Coming soon ==MPI problems== An overview of some of the most challenging problems to identify and resolve (coming soon) 0bb4191536fcdf48c11a2bfb699d8fda36251d0b Quota 0 38 706 540 2022-07-21T19:14:38Z Mjh 2 wikitext text/x-wiki Use of space on <tt>/home</tt> and <tt>/home2</tt> is subject to a quota system. You have a fixed amount of space that you are allowed to use: if you exceed this, you will no longer be able to create files in your home directory. The current default quota for all users is 50 GB. When you reach 49 GB, you will be warned and given a period (1 week) in which your usage should be reduced below 49 GB; if you fail to reduce usage in this period, or if your usage reaches 50 GB, new file creation will be blocked. The quota is ''not'' an indication of expected reasonable usage for a cluster user. You should try to keep your use of the home directory as low as possible. If you believe you have a specific need to exceed this quota, please contact the cluster [[administrators]]. There is no quota on the various data areas (see [[Storage]]) and these are the locations where it is appropriate to store large volumes of data. 2be2b9e6aeadbb0cabdc20450a8a18d2c2e028ea Architecture 0 7 708 654 2022-07-22T08:49:25Z Mjh 2 wikitext text/x-wiki The cluster consists of == head/login nodes == * 3 head nodes: headnode1, headnode2, headnode3 each with 192 GB RAM and 2 x 12-core Xeon processors, for user login and development * job/queue server, uhhpc, with 12 cores and 32 GB RAM: web, jobs, mail, DNS etc == compute nodes == * 80 Xeons (Gold 6130) 2 socket x 16 core (no hyperthreading) with 192 GB RAM and FDR Infiniband (node001-080: rack1, rack2, rack3), in the main queue * 16 AMD EPYC 7452 2 socket x 32-core (no hyperthreading) with 256 GB RAM and FDR Infiniband (node097-112: rack4) in the large queue.
* 16 AMD EPYC 7552 2 socket x 48-core (no hyperthreading) with 256 GB RAM and FDR Infiniband (node081-096: rack5) in the test queue * 12 Xeons (E5-2660s) 2 socket x 8-core with 64 GB RAM and FDR Infiniband (node129-140: chassis 9), in the main queue * 4 Xeons (E5-2660s) 2 socket x 8-core with 16 GB RAM and FDR Infiniband are dedicated AQF (CAIR) nodes (node141-144: chassis 9), in the forecast queue * Four [[GPUs|GPU]] machines, gpu1-4, in the gpu queue == GPUs == * See the separate page on [[GPUs]] == Servers and dedicated login nodes == * A file server, stri-server, which is a 2 socket x 8 core Xeon machine ** 473 TB of [[storage]] attached via Fibre Channel to this server. * A cair data processing server - [[cair-cluster]], providing logins and storage to CAIR users ** 132 TB of [[storage]] attached via Fibre Channel to this server. * A separate server -- [[cair-forecast]] -- providing an additional 80 TB of storage and processing for CAIR AQF use. * A separate server -- [[lofar-server]] -- providing logins and 100 TB of storage for [[LOFAR-UK Compute Facility|LOFAR-UK]] users * ramius, a dedicated 16-core machine * mancuso, a dedicated 20-core machine * metadata and dstorage1-12, file servers providing 1.1 PB of BeegFS distributed [[storage]] == Networking == * Ethernet and infiniband switches provide connectivity. The image below shows the physical layout of the cluster components as at late 2011. The cluster is no longer in this configuration. http://stri-cluster.herts.ac.uk/cluster2.jpg 47a9ed2eca574239d7e0b5774b630ee70af7300b 709 708 2022-07-22T08:50:08Z Mjh 2 /* Servers and dedicated login nodes */ wikitext text/x-wiki The cluster consists of == head/login nodes == * 3 head nodes: headnode1, headnode2, headnode3 each with 192 GB RAM and 2 x 12-core Xeon processors, for user login and development * job/queue server, uhhpc, with 12 cores and 32 GB RAM: web, jobs, mail, DNS etc == compute nodes == * 80 Xeons (Gold 6130) 2 socket x 16 core (no hyperthreading) with 192 GB RAM and FDR Infiniband (node001-080: rack1, rack2, rack3), in the main queue * 16 AMD EPYC 7452 2 socket x 32-core (no hyperthreading) with 256 GB RAM and FDR Infiniband (node097-112: rack4) in the large queue. * 16 AMD EPYC 7552 2 socket x 48-core (no hyperthreading) with 256 GB RAM and FDR Infiniband (node081-096: rack5) in the test queue * 12 Xeons (E5-2660s) 2 socket x 8-core with 64 GB RAM and FDR Infiniband (node129-140: chassis 9), in the main queue * 4 Xeons (E5-2660s) 2 socket x 8-core with 16 GB RAM and FDR Infiniband are dedicated AQF (CAIR) nodes (node141-144: chassis 9), in the forecast queue * Four [[GPUs|GPU]] machines, gpu1-4, in the gpu queue == GPUs == * See the separate page on [[GPUs]] == Servers and dedicated login nodes == * A cair data processing server - [[cair-cluster]], providing logins and storage to CAIR users ** 132 TB of [[storage]] attached via Fibre Channel to this server. * A separate server -- [[cair-forecast]] -- providing an additional 80 TB of storage and processing for CAIR AQF use. * A separate server -- [[lofar-server]] -- providing logins and 100 TB of storage for [[LOFAR-UK Compute Facility|LOFAR-UK]] users * ramius, a dedicated 16-core machine * mancuso, a dedicated 20-core machine * smp4 and smp5, dedicated 32-core and 48-core machines * metadata and dstorage1-22, file servers providing 3.9 PB of BeegFS distributed [[storage]] == Networking == * Ethernet and infiniband switches provide connectivity. 
The image below shows the physical layout of the cluster components as at late 2011. The cluster is no longer in this configuration. http://stri-cluster.herts.ac.uk/cluster2.jpg b4e9d6544662e17cbfef154624e8e7313cb58152 720 709 2023-08-07T09:38:13Z Mjh 2 /* Servers and dedicated login nodes */ wikitext text/x-wiki The cluster consists of == head/login nodes == * 3 head nodes: headnode1, headnode2, headnode3 each with 192 GB RAM and 2 x 12-core Xeon processors, for user login and development * job/queue server, uhhpc, with 12 cores and 32 GB RAM: web, jobs, mail, DNS etc == compute nodes == * 80 Xeons (Gold 6130) 2 socket x 16 core (no hyperthreading) with 192 GB RAM and FDR Infiniband (node001-080: rack1, rack2, rack3), in the main queue * 16 AMD EPYC 7452 2 socket x 32-core (no hyperthreading) with 256 GB RAM and FDR Infiniband (node097-112: rack4) in the large queue. * 16 AMD EPYC 7552 2 socket x 48-core (no hyperthreading) with 256 GB RAM and FDR Infiniband (node081-096: rack5) in the test queue * 12 Xeons (E5-2660s) 2 socket x 8-core with 64 GB RAM and FDR Infiniband (node129-140: chassis 9), in the main queue * 4 Xeons (E5-2660s) 2 socket x 8-core with 16 GB RAM and FDR Infiniband are dedicated AQF (CAIR) nodes (node141-144: chassis 9), in the forecast queue * Four [[GPUs|GPU]] machines, gpu1-4, in the gpu queue == GPUs == * See the separate page on [[GPUs]] == Servers and dedicated login nodes == * A separate server -- [[lofar-server]] -- providing logins and 100 TB of storage for [[LOFAR-UK Compute Facility|LOFAR-UK]] users * ramius, a dedicated 16-core machine * mancuso, a dedicated 20-core machine * smp4 and smp5, dedicated 32-core and 48-core machines * smp6, dedicated 96-core machine * metadata and dstorage1-23, file servers providing 4.2 PB of BeegFS distributed [[storage]] == Networking == * Ethernet and infiniband switches provide connectivity. The image below shows the physical layout of the cluster components as at late 2011. The cluster is no longer in this configuration. http://stri-cluster.herts.ac.uk/cluster2.jpg e24b2eb687cdddb07bee12f3e03fe51111f7f512 721 720 2023-08-07T09:38:29Z Mjh 2 wikitext text/x-wiki The cluster consists of == head/login nodes == * 3 head nodes: headnode1, headnode2, headnode3 each with 192 GB RAM and 2 x 12-core Xeon processors, for user login and development * job/queue server, uhhpc, with 12 cores and 32 GB RAM: web, jobs, mail, DNS etc == compute nodes == * 80 Xeons (Gold 6130) 2 socket x 16 core (no hyperthreading) with 192 GB RAM and FDR Infiniband (node001-080: rack1, rack2, rack3), in the main queue * 16 AMD EPYC 7452 2 socket x 32-core (no hyperthreading) with 256 GB RAM and FDR Infiniband (node097-112: rack4) in the large queue. 
* 16 AMD EPYC 7552 2 socket x 48-core (no hyperthreading) with 256 GB RAM and FDR Infiniband (node081-096: rack5) in the test queue * 12 Xeons (E5-2660s) 2 socket x 8-core with 64 GB RAM and FDR Infiniband (node129-140: chassis 9), in the main queue * 4 Xeons (E5-2660s) 2 socket x 8-core with 16 GB RAM and FDR Infiniband are dedicated AQF (CAIR) nodes (node141-144: chassis 9), in the forecast queue * Four [[GPUs|GPU]] machines, gpu1-4, in the gpu queue == GPUs == * See the separate page on [[GPUs]] == Servers and dedicated login nodes == * A separate server -- [[lofar-server]] -- providing logins and 100 TB of storage for [[LOFAR-UK Compute Facility|LOFAR-UK]] users * ramius, a dedicated 16-core machine * mancuso, a dedicated 20-core machine * smp4 and smp5, dedicated 32-core and 48-core machines * smp6, dedicated 96-core machine * metadata and dstorage1-23, file servers providing 4.2 PB of BeegFS distributed [[storage]] == Networking == * Ethernet and infiniband switches provide connectivity. 56d67c07e9487f812b5b7c9df462f3a01729d213 724 721 2023-09-26T10:31:23Z Mjh 2 wikitext text/x-wiki The cluster consists of == head/login nodes == * 3 head nodes: headnode1, headnode2, headnode3 each with 192 GB RAM and 2 x 12-core Xeon processors, for user login and development * job/queue server, uhhpc, with 12 cores and 32 GB RAM: web, jobs, mail, DNS etc == compute nodes == * 80 Xeons (Gold 6130) 2 socket x 16 core (no hyperthreading) with 192 GB RAM and FDR Infiniband (node001-080: rack1, rack2, rack3), in the main queue * 16 AMD EPYC 7452 2 socket x 32-core (no hyperthreading) with 256 GB RAM and FDR Infiniband (node097-112: rack4) in the large queue. * 16 AMD EPYC 7552 2 socket x 48-core (no hyperthreading) with 256 GB RAM and FDR Infiniband (node081-096: rack5) in the test queue * 16 Xeons (E5-2660s) 2 socket x 8-core with varying amounts of RAM and FDR Infiniband (node129-140: chassis 9), in the main queue * Four [[GPUs|GPU]] machines, gpu1-4, in the gpu queue * Coming soon, 8 new GPU machines (gpu5-12) with 4xA100 GPUs. * Coming soon, 8 new 96-core nodes == GPUs == * See the separate page on [[GPUs]] == Servers and dedicated login nodes == * A separate server -- [[lofar-server]] -- providing logins and 100 TB of storage for [[LOFAR-UK Compute Facility|LOFAR-UK]] users * ramius, a dedicated 16-core machine * mancuso, a dedicated 20-core machine * smp4 and smp5, dedicated 32-core and 48-core machines * smp6-8, 96-core 1-TB RAM systems * metadata and dstorage1-23, file servers providing 4.2 PB of BeegFS distributed [[storage]] == Networking == * Ethernet and infiniband switches provide connectivity. 0518003e35f0f26a7826a5219fccdc4e28d9bee0 CASA 0 28 711 334 2023-01-05T14:08:16Z Mayaahorton 17 wikitext text/x-wiki CASA is software for radio astronomy data reduction. Various versions are installed on the cluster. The latest version is always in <tt>/soft/casapy</tt> (a symbolic link to the real directory). To use casa, do <tt>module load casa</tt> and then run it with <tt>casapy</tt> or <tt>casa</tt>. You should not run CASA on the head node: either run it through the batch job system or use an [[interactive jobs|interactive job]]. 062acbb81b53be9ff93f5e9fb0c8052355c2ca46 Singularity 0 79 712 603 2023-01-05T16:52:27Z Mjh 2 wikitext text/x-wiki Singularity ([https://github.com/sylabs/singularity]) is installed on the cluster. This is our preferred way of running containerized applications, including Docker containers. 
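As a quick illustration of the workflow described in the next paragraph (load the module, then run a container with your data directories bound in), here is a minimal sketch; the image name, bind paths and command are placeholders rather than a recommended configuration, and for use inside job scripts see the note on module commands on the [[Modules]] page: <pre>
# load the singularity module (in a login shell)
module load singularity

# run a command inside a container image, binding a data directory into the container
# (my-image.sif and the paths below are illustrative placeholders)
singularity exec --bind /beegfs/my-data:/data my-image.sif python3 /data/analysis.py
</pre>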
You need to use the singularity [[modules|module]] (<tt>module load singularity</tt>) to get singularity on your path. You probably want to use the <tt>--bind</tt> option to bind data directories such as <tt>/beegfs</tt>. Note that singularity images can't be built on BeeGFS (they can be stored there once built). This will affect users converting from Docker images. If this causes you problems please contact the [[administrators]]. ad1bff97711879de8cff4a4e79556b2bb0fb2087 Fair share 0 39 713 553 2023-02-09T16:09:12Z Mjh 2 wikitext text/x-wiki There are various systems in place to try to make sure that everyone gets a fair share of the cluster (particularly the main cluster) and that a mixture of long and short, large and small jobs can be run. Jobs are run based on a priority assigned by the scheduler. The priority takes account of a number of factors: * Jobs that involve more CPUs intrinsically have higher priority. This is to stop single-CPU work pushing out large jobs. * Jobs that have been in the queue for longer have higher priority, up to a maximum of 64 idle jobs per user (after that, the jobs will wait in the queue but will not accrue priority) * Jobs from users who have not recently significantly used the cluster have priority over those from users who have. (This is called the fair-share mechanism: the 'fair share amount' is a weighted integral of usage over the last week, weighted towards more recent use.) In addition, by default, * no user can have more than 512 processors' worth of jobs running at once. This is intended to stop a single user monopolizing the cluster * no user can have a processor-time product that exceeds 1 week x 256 cores running at any given time. This is intended to stop large long jobs blocking shorter jobs. These policies apply to all queues but are tailored to the main queue where the competition is highest. If they cause you problems, please contact the [[administrators]]. We are happy to review policies to try to get the fairest result for everyone, and we can relax the default requirements if you have a particular need for more resources. 0560d02a90354e79bb8f494043b2db43fcc9a4af Gromacs 0 19 714 558 2023-04-29T09:47:56Z Mjh 2 wikitext text/x-wiki [http://www.gromacs.org Gromacs] is a software package for molecular dynamics simulation. It treats molecules as particles in a classical mechanics force field. There are a number of versions of Gromacs on the cluster. Gromacs 2018 and 2023 are the most recent ones. == '''How to perform a simulation with Gromacs' mdrun:''' == 1) You need to prepare the binary simulation start file (tpr-file) either on your local Linux machine or on the headnode of the cluster. If you prepare it on the local machine, make sure you use the Gromacs version corresponding to the one you want to use on the cluster. In order to run Gromacs on the headnode for preparation, it is a good idea to put the following into your shell startup file (e.g. <tt>.bashrc</tt>; tcsh users should use the equivalent <tt>setenv</tt> commands in <tt>.cshrc</tt>): * For the 2023 version source /soft/gromacs-2023.1/bin/GMXRC (either CPU or GPU version) export LD_LIBRARY_PATH="/soft/gcc-10.4.0/lib64:${LD_LIBRARY_PATH}" * For the 2018 version source /soft/gromacs-2018/bin/GMXRC (or /soft/gromacs-2018-gpu/bin/GMXRC for GPU preparation) export LD_LIBRARY_PATH="/soft/mpi/mvapich2-1.6/lib:${LD_LIBRARY_PATH}" export LD_LIBRARY_PATH="/soft/gcc-6.4/lib64:${LD_LIBRARY_PATH}" 2) Prepare a shell script (e.g. runjob.sh) as shown in the example below.
[[Jobs|More info.]] 3) Make the script executable: chmod +x runjob.sh 4) Submit the job to the cluster: qsub runjob.sh The script below needs to be adjusted according to your needs, i.e. the user name (#PBS -u), the maximum time a job should run (#PBS -l walltime=<hours>:<min>:<sec>), the working directory and the details of the mdrun command. Look here for [[groperform|optimising performance]]. * There are two versions of Gromacs 2018.2, for GPU and non-GPU use, located in /soft/gromacs-2018 and /soft/gromacs-2018-gpu. Note that all GPUs attached to the node are used automatically. The maximum walltime is 48 hours. * For Gromacs 2023.1 both CPU and GPU versions are located in /soft/gromacs-2023.1. Use gmx for GPUs and 32-core compute nodes. Use gmx_mpi for 64 and 96-core compute nodes. [http://www.phy.bme.hu/~cluster/docs/PBS.html A good tutorial for using PBS] ''Andreas/Hershna'' '''For GPU:''' --------------
<pre>
#!/bin/sh
#PBS -N GromacsTest
#PBS -q gpu
#PBS -l nodes=1:ppn=16
#PBS -l walltime=48:00:00
#PBS -j oe
#PBS -k oe
#PBS -u hpatel

# runs a job with name 'GromacsTest' on the gpu machine on the cluster
# uses 1 GPU machine
# set a maximum time of forty-eight hours (walltime)
# merge 'output' and 'standard error' and output both to 'standard output' (-j oe)
# produce the output while the job is running (-k oe)
# specifies user 'hpatel'

# set required paths:
source /soft/gromacs-2018-gpu/bin/GMXRC

# specify working directory:
cd /home/hpatel/gromacsGPU

export LD_LIBRARY_PATH="/soft/mpi/mvapich2-1.6/lib:${LD_LIBRARY_PATH}"
export LD_LIBRARY_PATH="/soft/gcc-6.4/lib64:${LD_LIBRARY_PATH}"

### This is the command ###
gmx mdrun -s run.tpr -c after_md.gro -v -stepout 1000
### command end ###

# start with 'qsub runjob.sh'
</pre>
-------------- For non-GPU use, Gromacs is optimised for the newer nodes that contain 32 cores. In order to make sure that the job runs on these nodes, you have to request them with #PBS -l nodes=1:ppn=32. An example of a job script is shown below: '''Without use of GPU:''' --------------
<pre>
#!/bin/sh
#PBS -N GromacsTest
#PBS -q main
#PBS -l nodes=1:ppn=32
#PBS -l walltime=48:00:00
#PBS -j oe
#PBS -k oe
#PBS -u hpatel

# runs a job with name 'GromacsTest' on the main cluster
# set a maximum time of forty-eight hours (walltime)
# merge 'output' and 'standard error' and output both to 'standard output' (-j oe)
# produce the output while the job is running (-k oe)
# specifies user 'hpatel'

# set required paths:
source /soft/gromacs-2018/bin/GMXRC

# specify working directory:
cd /home/hpatel/gromacs

export LD_LIBRARY_PATH="/soft/mpi/mvapich2-1.6/lib:${LD_LIBRARY_PATH}"
export LD_LIBRARY_PATH="/soft/gcc-6.4/lib64:${LD_LIBRARY_PATH}"

### This is the command ###
gmx mdrun -s run.tpr -c after_md.gro -v -stepout 1000
### command end ###

# start with 'qsub runjob.sh'
</pre>
24d16e8af043ffb749debed6605e710b64c30379 Compilers 0 16 717 53 2023-04-29T09:58:09Z Mjh 2 wikitext text/x-wiki The standard C and Fortran compilers are <tt>gcc</tt> and <tt>gfortran</tt>. The Intel compilers <tt>icc</tt> and <tt>ifort</tt> are also available and may be superior for many applications. By default the gcc and Fortran versions are 4.8.5. These are very old and may not work with modern software. To access later versions look at <tt>module avail</tt>. Currently gcc-6.4, gcc-10.4 and gcc-13.1 are available. To run software built with these you will need to have the relevant libraries on your LD_LIBRARY_PATH unless you have explicitly loaded the module. E.g.
in a job script you might do <tt>setenv LD_LIBRARY_PATH /soft/gcc-10.4.0/lib64</tt> (tcsh) or <tt>export LD_LIBRARY_PATH=/soft/gcc-10.4.0/lib64</tt>. The same requirements apply to the Intel compilers accessible with <tt>module load intel</tt>, where <tt>/soft/intel/lib/intel64_lin</tt> needs to be on your LD_LIBRARY_PATH. 2ba2356559b064646adbb51c1e4df037c5f1a7bf 719 717 2023-04-29T09:59:31Z Mjh 2 wikitext text/x-wiki The standard C and Fortran compilers are <tt>gcc</tt> and <tt>gfortran</tt>. The Intel compilers <tt>icc</tt> and <tt>ifort</tt> are also available and may be superior for many applications. By default the gcc and Fortran versions are 4.8.5. These are very old and may not work with modern software. To access later versions look at <tt>module avail</tt>. Currently gcc-6.4, gcc-10.4 and gcc13.1 are available. To run software built with these you will need to have the relevant libraries on your LD_LIBRARY_PATH unless you have explicitly loaded the [[Modules|module]]. E.g. in a job script you might do <tt>setenv LD_LIBRARY_PATH /soft/gcc-10.4.0/lib64</tt> (tcsh) or <tt>export LD_LIBRARY_PATH=/soft/gcc-10.4.0/lib64</tt>. The same requirements apply to the Intel compilers accessible with <tt>module load intel</tt>, where <tt>/soft/intel/lib/intel64_lin</tt> needs to be on your LD_LIBRARY_PATH. 05f25884e67e2fa6974798cce0ed94e6972ac49b Modules 0 33 718 653 2023-04-29T09:58:52Z Mjh 2 wikitext text/x-wiki The cluster uses the <tt>environment-modules</tt> package to manage environment variable settings for software packages. When a 'module' is loaded, the appropriate environment settings are added to the start of your PATH, MANPATH, LD_LIBRARY_PATH etc, and other variables used by the relevant package may be set as well. When a module is unloaded, the changes made to environment variables are undone. Documentation of this package is available at this link[http://modules.sourceforge.net/], or type <tt>man module</tt> Basic commands include: * <tt>module list</tt>. See what modules you have loaded. * <tt>module avail</tt>. List what modules are available to you. * <tt>module load [modulename]</tt>. Load a module, e.g. <tt>module load mpich2-local</tt> * <tt>module unload [modulename]</tt>. Unload a module. * <tt>module show [modulename]</tt>. Show what loading a module does. You may use <tt>module</tt> commands in your .bashrc or .cshrc, e.g. to select your preferred [[MPI]] environment. Module commands do not work in job scripts or scripts run by jobs because the relevant aliases are only set up by login shells. This means to get the effect of loading a module you should either manually set environment variables as described in <tt>module show</tt> or do <pre> eval `/usr/bin/modulecmd [shell] load [module]` </pre> where <tt>[shell]</tt> is the name of the shell you are using. We are happy to add other environments as modules -- please contact the cluster [[Administrators]]. 34068dddaf4ddd34b3fb94c198c0031eb6928fd5 GPUs 0 71 722 663 2023-09-04T07:36:15Z Mjh 2 wikitext text/x-wiki Several machines on the cluster have attached NVIDIA GPUs. * gpu1: The attached GPUs are 6 Tesla K80 units with 16GB VRAM (this machine is out of service) * gpu2 and gpu3: These both have 3 Tesla V100 units with 16 GB VRAM each on gpu3 and 32 GB VRAM on gpu2. * gpu4: This has one Tesla V100S and two V100s, with a mixture of 16 GB and 32 GB. * ramius has a single Tesla K40c. ramius is a private machine, the other machines are accessible through the gpu queue. 
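For orientation, here is a minimal sketch of a batch job targeting the gpu queue; the job name, core count, CUDA version, directory and program below are illustrative placeholders, and the notes that follow on contention (for exclusive access, request all cores of the host) and on CUDA modules still apply: <pre>
#!/bin/sh
#PBS -N gpu-test
#PBS -q gpu
#PBS -l nodes=1:ppn=16
#PBS -l walltime=24:00:00

# module commands do not work directly in job scripts (see the Modules page),
# so load the CUDA environment via modulecmd instead
eval `/usr/bin/modulecmd sh load cuda-10.0`

# move to the working directory and run the (placeholder) CUDA program
cd /home/username/my-gpu-job
./my_cuda_program
</pre>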
The NVIDIA CUDA software is installed in '/soft/cuda-VERSION' where VERSION is the version number of CUDA. You can do e.g. <tt>module load cuda-10.0</tt> to make sure the libraries and compilers are on your path. Note: * At the moment, there is no provision for preventing contention between users of the Tesla unit. That is, if more than one user runs a job that tries to use it at one time, there may be contention and we don't know what effect this will have. * Equally, there is no provision for preventing contention on the host side -- if you only ask for a few CPU cores and another CPU-bound job takes the remainder, this may (or may not) interfere with your job. The easiest way to overcome any issues of contention would be to ask for all cores of the host machine, even if you're not going to use them all: then the job submission system will not allow any other jobs to run on the host, so you have exclusive access to the GPU. The current arrangement in principle allows non-GPU jobs running on a machine with a GPU to block CUDA jobs. If this becomes a problem, we will review the queueing arrangements. == Tensorflow == Tensorflow packages are available under python3.6. Make sure CUDA 10.0 is loaded (see above) and <tt>/soft/python3/usr/local/lib64/python3.6/site-packages</tt> is on your PYTHONPATH e.g. by doing <tt>module load python3 cuda-10.0</tt>. If you are running on a GPU machine you will then get GPU acceleration in Tensorflow. == Via OpenGL context == It is possible for an application to access an OpenGL context on the GPUs using the provided 'headless' X server configuration. * User needs to start X server: <pre> X :42 & </pre> where "42" is a free display number, assuming "42" is free. If a number is chosen that is not free, an error message will appear. * Set the DISPLAY environment variable: <pre> export DISPLAY=:42.0 </pre> where "42" is the free display number as used for starting the X server, assuming the syntax of the BASH shell and .0 denotes the first screen (there are currently four - but using more than one is untested). * start application, which will need to request an OpenGL context, make use of it, output results unattended and quit after it is done (a render job for example). Submission of such jobs through the job queue is also untested, so needs to be discussed with the administrators first. e780fa7456fecf3f23b5e0f4ce8d705cb45a753a Policies 0 4 723 505 2023-09-21T11:04:39Z Mjh 2 wikitext text/x-wiki The cluster is by design a shared resource. In using it you must be considerate of other users. Some detailed guidelines are as follows: * Accounts are for use by the named user only. You must not allow anyone else to use your account. * The [[architecture|head node]]s must '''never''' be used for computation of any kind on a larger scale than a few minutes' testing or processing of results from the compute nodes. * The only permitted method of using the compute nodes is by way of the [[jobs|batch queuing system]]. You '''must not''' log in to the compute nodes directly using ssh. If you need an interactive session on a compute node, e.g. for data processing, use the [[interactive jobs]] facility. * When using the batch queuing system you '''must''' honour the allocations of nodes that your job is given. * Please use the [[storage]] with consideration to others. We reserve the right in extreme situations to tidy up after you. * There is a [[fair share]] policy which has been tuned to a certain extent and is operated automatically by the scheduler. 
Essentially this means that you may find that other people's jobs overtake yours in the queue in an attempt to distribute resources fairly. The intention is that this system will operate only on a timescale of days, though. We do not presently guarantee fairness on timescale of minutes to hours, because to do that we would have to be able to pre-empt jobs that are already running. Please be aware of these limitations of the fair-share policy and work within them. 24003d633fdd76f79afe9e61adbfd18b7695b338 User:Jmcgarry 2 90 726 2023-09-26T10:33:16Z Mjh 2 Creating user page for new user. wikitext text/x-wiki Cluster Manager / Radio Astronomy PhD student b621d283c4e2c20e45e38f4201bf916be908a55e User talk:Jmcgarry 3 91 727 2023-09-26T10:33:16Z Mjh 2 Welcome! wikitext text/x-wiki '''Welcome to ''Clusterwiki''!''' We hope you will contribute much and well. You will probably want to read the [https://www.mediawiki.org/wiki/Special:MyLanguage/Help:Contents help pages]. Again, welcome and have fun! [[User:Mjh|Mjh]] ([[User talk:Mjh|talk]]) 10:33, 26 September 2023 (UTC) 41b5f93d0ce19e5df3b28e0d32e062d22a900503 Terms of use 0 77 728 549 2023-10-03T15:24:15Z Mjh 2 wikitext text/x-wiki Use of the UHHPC facility is subject to terms and conditions. To apply for or continue to hold an account you must explicitly agree to these conditions. * Access to UHHPC is available to three classes of people: *# Members of the University of Hertfordshire (UH), including undergraduate students, who have a legitimate need to use the cluster facilities. Undergraduate students must apply through an academic supervisor. *# External collaborators of UH research staff, for work on projects that will directly benefit UH. *# Members of external consortia who have negotiated an agreement with UH (e.g. [[WEAVE]], [[LOFAR-UK Compute Facility|LOFAR-UK]]). * Access is conditional on abiding by the [[policies]] that govern use of the cluster. Access may be withdrawn at the discretion of the [[administrators]] in extreme cases. * Access is given to individuals. Account credentials may not be shared. An individual is solely responsible for the use made of their account. * UH will store the full name and contact e-mail address of all users. A valid, monitored e-mail address must be on record for all users. Users must notify the [[administrators]] of any change in their contact details. Your details will be stored securely on the cluster itself and will be used both for a cluster mailing list and to contact you in case of problems with your use of the cluster. * The administrators may take whatever actions they feel necessary for troubleshooting or to ensure the smooth operation and security of the facility, which may include inspecting any data or programs stored on the cluster. * UH takes no responsibility for backing up users' data. Users store data on UHHPC at their own risk. * UH makes no guarantee about the level of service provided at any given time. * Access is provided only while it is needed. Users must notify the [[administrators]] when they no longer need access. Accounts will be deleted according to the [[Account cancellation policy]]. In particular if your registered e-mail address becomes invalid and we cannot contact you we will assume that your account can be deleted. 15326a895c2449a18417b94e5a1fdd81a179bf2e Star-CCM+ 0 50 729 359 2023-10-11T08:46:48Z Jmcgarry 18 wikitext text/x-wiki Star-CCM+ is an engineering package which can be used to solve CFD problems. 
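For context, a batch run of Star-CCM+ typically combines the licence-server setting described below with an invocation of the solver on a macro and simulation file. The following is only a rough sketch under assumptions: it assumes <tt>starccm+</tt> is on your PATH after following the linked guide, and the queue, core count, directory, macro and sim file names are placeholders; treat the guide below as the reference. <pre>
#!/bin/sh
#PBS -N starccm-test
#PBS -q main
#PBS -l nodes=1:ppn=32
#PBS -l walltime=24:00:00

# point at the licence server (see the .flexlmrc / setenv notes below)
export CDLMD_LICENSE_FILE=1999@zaxx.stca.herts.ac.uk

cd /home/username/starccm-case

# run a macro on a simulation file in batch mode; starccm+ is assumed to be on
# the PATH and run.java / mycase.sim are placeholder file names
starccm+ -batch run.java -np 32 mycase.sim
</pre>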
This [http://{{SERVERNAME}}/docs/starccm.pdf guide] (kindly written by Vitaly Voloshin) shows you how to use Star-CCM+ on the cluster. Please note that this guide is now fairly old and may not fully reflect the best approach for the current set up. STAR-CCM+ on the cluster gets its licences from zaxx.stca.herts.ac.uk which is not part of the HPC system and not in the control of the HPC team. Please make sure to properly set your CDLMD_LICENSE_FILE environment variable to access the licence server. This can be done by setting up a .flexlmrc file which contains the line: <pre> CDLMD_LICENSE_FILE=1999@zaxx.stca.herts.ac.uk </pre> Alternatively, tcsh users can set their environment using: <pre> setenv CDLMD_LICENSE_FILE 1999@zaxx.stca.herts.ac.uk </pre> The following files are those listed in the guide: *[http://{{SERVERNAME}}/docs/queue_set.sh queue_set.sh] *[http://{{SERVERNAME}}/docs/starccm_start.sh starccm_start.sh] *[http://{{SERVERNAME}}/docs/run.java run.java] *[http://{{SERVERNAME}}/docs/surf_mesh.java surf_mesh.java] *[http://{{SERVERNAME}}/docs/sv_mesh.java sv_mesh.java] *[http://{{SERVERNAME}}/docs/vol_mesh.java vol_mesh.java] ef3a19942afdd67cf9f116dbfa0ec8ee819fa55c ORCA 0 92 730 2024-02-07T17:24:05Z Jmcgarry 18 Created page with "ORCA is an ab initio quantum chemistry program package for modern electronic structure methods including density functional theory, many-body perturbation, coupled cluster, mu..." wikitext text/x-wiki ORCA is an ab initio quantum chemistry program package for modern electronic structure methods including density functional theory, many-body perturbation, coupled cluster, multireference methods, and semi-empirical quantum chemistry methods. Its main field of application is larger molecules, transition metal complexes, and their spectroscopic properties. =Accessing ORCA= ORCA is free to use in academic institutions, but individual users are required to make an account on the ORCA forums. If you wish to access ORCA on UHHPC you must send '''confirmation of an activated ORCA forum account''' to the [[administrators]] before your account can be added to the ORCA user group. =Running ORCA= Once you are a member of the ORCA group, you can gain access to ORCA by adding the following path to your environment: <pre> setenv PATH /soft/orca_5_0_4/:$PATH </pre> To run calculations in serial you can then call <tt>orca</tt> to start the software. If you want to run calculations in parallel, you will also need to add paths that set up OpenMPI: <pre> setenv PATH /soft/openmpi-4.1.1/bin/:$PATH setenv LD_LIBRARY_PATH /soft/openmpi-4.1.1/lib/:$LD_LIBRARY_PATH </pre> Additionally, to actually run the software in parallel you will need to call the full path to ORCA: <pre> /soft/orca_5_0_4/orca </pre> The start of your input file will also need to be edited to allow the software to run in parallel. For ppn≤8 the first line can be edited to include the PAL keyword, e.g. for ppn=4: <pre> !HF DEF2-SVP PAL4 </pre> For ppn>8 a new line will need to be added, e.g. for ppn=16: <pre> !HF DEF2-SVP %PAL NPROCS 16 END </pre> =Example job= To test that you can correctly access and run the ORCA you can try running a "hello water" job. 
First you will need to create a plain text input file, water.inp: <pre> !HF DEF2-SVP %PAL NPROCS 16 END * xyz 0 1 O 0.0000 0.0000 0.0626 H -0.7920 0.0000 -0.4973 H 0.7920 0.0000 -0.4973 * </pre> Then create a job script for submission, orcajob.sh: <pre> #!/bin/csh #PBS -N hello-water #PBS -l nodes=1:ppn=16 #PBS -l walltime=00:10:00 #PBS -q main #PBS -m abe #PBS -u username #PBS -W group_list=orca # Set the ORCA and OpenMPI paths setenv PATH /soft/orca_5_0_4/:$PATH setenv PATH /soft/openmpi-4.1.1/bin/:$PATH setenv LD_LIBRARY_PATH /soft/openmpi-4.1.1/lib/:$LD_LIBRARY_PATH # Move to working directory cd /path/to/working/directory # Command /soft/orca_5_0_4/orca water.inp > water.out </pre> This example will run the calculation in parallel across 16 cores. To submit the script simply pass the follwing command to the terminal on the headnode: <pre> qsub orcajob.sh </pre> 3b3cb27c4f93ec3d0c5c12a2c945ee66cb038d98 Software 0 17 731 716 2024-02-07T17:33:10Z Jmcgarry 18 /* Molecular dynamics */ wikitext text/x-wiki This page documents the software installed on the cluster and its location. Detailed local documentation of software, if available, goes in a page specific to that software linked from this list. Users are encouraged to update the wiki with descriptions of the software they use. Contact the [[administrators]] if you need an upgraded version of any of these. If you need any freely available software package that runs on Linux, in general we can install it. We do not have resources to pay for licences for commercial software, though, so if you need a commercial software package to be installed you will also need to provide funding for a licence. = Programming languages and development environments = * GNU C, C++ and Fortran: available by default, <tt>module load gcc-6.4</tt>, <tt>gcc 10.4</tt> or <tt>gcc 13.1</tt> for newer versions * Intel C and Fortran: <tt>module load intel</tt> * Python 2 and 3 and many [[Python packages]] * [[Matlab]]: in <tt>/soft/MATLAB/R2022b/bin/matlab</tt> * [[IDL]]: in <tt>/soft/idl/idl/bin</tt> * [[R]]: installed by default or <tt>/soft/R</tt> * [[Julia]]: <tt>module load julia</tt> * [[GPUs|CUDA]]: <tt>module load cuda...</tt> = Astronomical software = * Astropy: see [[Python packages]] * [[AIPS]]: 31DEC18 installed in <tt>/soft/aips</tt> * [[CASA]]: installed in <tt>/soft/casa...</tt> * [[Starlink]]: in <tt>/soft/star</tt> or <tt>/soft/stardev</tt> * [[LOFAR]]: in <tt>/soft/lofar-xxxxxx</tt>, see page for details * [[Miriad]]: in <tt> /soft/miriad</tt> * [[ciao]]: in <tt>/soft/ciao-x.x</tt> * [[SAS]]: in <tt>/soft/xmmsas_xxxxxx</tt> * [[PLUTO]]: see page for documentation * [[aoflagger]]: in <tt>/soft/aoflagger</tt> * [[wsclean]]: in <tt>/soft/wsclean</tt> * [[Brats]]: in <tt>/soft/brats</tt> * [[Topcat]] and Stilts: in <tt>/soft/topcat</tt> * [[DS9]]: in /soft/bin/ds9 = Engineering = * [[Ansys Fluent]]: in <tt>/soft/ansys_inc</tt> * [[Star-CCM+]]: in <tt>/soft/STAR-CCM+14.02.010</tt> * [[Cantera]]: in <tt>/soft/cantera</tt> * [[Converge]]: in <tt>/soft/CONVERGE_Studio/</tt> = Molecular dynamics = * [[NAMD]]: in <tt>/soft/NAMD_2.13_Linux-x86_64-ibverbs-smp-CUDA</tt> * [[Gromacs]]: in <tt>/soft/gromacs-2018-gpu</tt> * Autodock [[Vina]]: in <tt>/soft/autodock_vina_1_1_1_linux_x86</tt> * [[Autodock]] : in <tt>/soft/autodock</tt> * [[iGemDock]]: in <tt>/soft/iGEMDOCKv2.1-centos/</tt> * [[ORCA]]: in <tt>/soft/orca_5_0_4</tt> = Computational neuroscience = * [[neuron]]: in <tt>/soft/nrn</tt> = Optimization = * [[Gurobi]]: in <tt>/soft/gurobi</tt> = Containerization 
= * [[Singularity]] in <tt>/soft/bin/singularity</tt> or <tt>module load singularity...</tt> 7043ee9fbd174f2582cfa75d269c715112580ec4 762 731 2024-09-20T11:21:45Z Mjh 2 wikitext text/x-wiki This page documents the software installed on the cluster and its location. Detailed local documentation of software, if available, goes in a page specific to that software linked from this list. Users are encouraged to update the wiki with descriptions of the software they use. Contact the [[administrators]] if you need an upgraded version of any of these. If you need any freely available software package that runs on Linux, in general we can install it. We do not have resources to pay for licences for commercial software, though, so if you need a commercial software package to be installed you will also need to provide funding for a licence. = Programming languages and development environments = * GNU C, C++ and Fortran: available by default, <tt>module load gcc-6.4</tt>, <tt>gcc 10.4</tt> or <tt>gcc 13.1</tt> for newer versions * Intel C and Fortran: <tt>module load intel</tt> * Python 3.9 and many [[Python packages]]. [[Jupyter notebooks]] are available * [[Matlab]]: in <tt>/soft/MATLAB/R2022b/bin/matlab</tt> * [[IDL]]: in <tt>/soft/idl/idl/bin</tt> * [[R]]: installed by default or <tt>/soft/R</tt> * [[Julia]]: <tt>module load julia</tt> * [[GPUs|CUDA]]: <tt>module load cuda...</tt> = Astronomical software = * Astropy: see [[Python packages]] * [[AIPS]]: 31DEC18 installed in <tt>/soft/aips</tt> * [[CASA]]: installed in <tt>/soft/casa...</tt> * [[Starlink]]: in <tt>/soft/star</tt> or <tt>/soft/stardev</tt> * [[LOFAR]]: in <tt>/soft/lofar-xxxxxx</tt>, see page for details * [[Miriad]]: in <tt> /soft/miriad</tt> * [[ciao]]: in <tt>/soft/ciao-x.x</tt> * [[SAS]]: in <tt>/soft/xmmsas_xxxxxx</tt> * [[PLUTO]]: see page for documentation * [[aoflagger]]: in <tt>/soft/aoflagger</tt> * [[wsclean]]: in <tt>/soft/wsclean</tt> * [[Brats]]: in <tt>/soft/brats</tt> * [[Topcat]] and Stilts: in <tt>/soft/topcat</tt> * [[DS9]]: in /soft/bin/ds9 = Engineering = * [[Ansys Fluent]]: in <tt>/soft/ansys_inc</tt> * [[Star-CCM+]]: in <tt>/soft/STAR-CCM+14.02.010</tt> * [[Cantera]]: in <tt>/soft/cantera</tt> * [[Converge]]: in <tt>/soft/CONVERGE_Studio/</tt> = Molecular dynamics = * [[NAMD]]: in <tt>/soft/NAMD_2.13_Linux-x86_64-ibverbs-smp-CUDA</tt> * [[Gromacs]]: in <tt>/soft/gromacs-2018-gpu</tt> * Autodock [[Vina]]: in <tt>/soft/autodock_vina_1_1_1_linux_x86</tt> * [[Autodock]] : in <tt>/soft/autodock</tt> * [[iGemDock]]: in <tt>/soft/iGEMDOCKv2.1-centos/</tt> * [[ORCA]]: in <tt>/soft/orca_5_0_4</tt> = Computational neuroscience = * [[neuron]]: in <tt>/soft/nrn</tt> = Optimization = * [[Gurobi]]: in <tt>/soft/gurobi</tt> = Containerization = * [[Singularity]] in <tt>/soft/bin/singularity</tt> or <tt>module load singularity...</tt> 3be672e247d1b0f4bc35442b3814ce1d94fb8de7 SMP machines 0 24 732 461 2024-02-28T19:23:37Z Mjh 2 wikitext text/x-wiki The SMP machines are: * smp4: 32-core machine with 256 GB RAM normally reserved for LOFAR-UK use * smp5: 48-core machine with 1.5 TB RAM * smp6-8: 96-core machines with 1 TB RAM. These machines are accessed through the smp queue but you should ask the [[administrators]] for access as they are usually reserved for particular applications. f8564560f5abb28f52553184e913e28be12c376c GPUs 0 71 733 722 2024-02-28T19:25:18Z Mjh 2 wikitext text/x-wiki Several machines on the cluster have attached NVIDIA GPUs. 
* gpu1: The attached GPUs are 6 Tesla K80 units with 16GB VRAM (this machine is out of service) * gpu2 and gpu3: These both have 3 Tesla V100 units with 16 GB VRAM each on gpu3 and 32 GB VRAM on gpu2. * gpu4: This has one Tesla V100S and two V100s, with a mixture of 16 GB and 32 GB. * gpu5 and gpu6: These each have 4 Tesla A100s with 80 GB RAM. * ramius has a single Tesla K40c. * dgx1 has 8 Tesla A100s. ramius and dgx1 are private machines, the other machines are accessible through the gpu queue. The NVIDIA CUDA software is installed in '/soft/cuda-VERSION' where VERSION is the version number of CUDA. You can do e.g. <tt>module load cuda-10.0</tt> to make sure the libraries and compilers are on your path. Note: * At the moment, there is no provision for preventing contention between users of the Tesla unit. That is, if more than one user runs a job that tries to use it at one time, there may be contention and we don't know what effect this will have. * Equally, there is no provision for preventing contention on the host side -- if you only ask for a few CPU cores and another CPU-bound job takes the remainder, this may (or may not) interfere with your job. The easiest way to overcome any issues of contention would be to ask for all cores of the host machine, even if you're not going to use them all: then the job submission system will not allow any other jobs to run on the host, so you have exclusive access to the GPU. The current arrangement in principle allows non-GPU jobs running on a machine with a GPU to block CUDA jobs. If this becomes a problem, we will review the queueing arrangements. == Tensorflow == Tensorflow packages are available under python3.6. Make sure CUDA 10.0 is loaded (see above) and <tt>/soft/python3/usr/local/lib64/python3.6/site-packages</tt> is on your PYTHONPATH e.g. by doing <tt>module load python3 cuda-10.0</tt>. If you are running on a GPU machine you will then get GPU acceleration in Tensorflow. == Via OpenGL context == It is possible for an application to access an OpenGL context on the GPUs using the provided 'headless' X server configuration. * User needs to start X server: <pre> X :42 & </pre> where "42" is a free display number, assuming "42" is free. If a number is chosen that is not free, an error message will appear. * Set the DISPLAY environment variable: <pre> export DISPLAY=:42.0 </pre> where "42" is the free display number as used for starting the X server, assuming the syntax of the BASH shell and .0 denotes the first screen (there are currently four - but using more than one is untested). * start application, which will need to request an OpenGL context, make use of it, output results unattended and quit after it is done (a render job for example). Submission of such jobs through the job queue is also untested, so needs to be discussed with the administrators first. 0eed45cce73a5afb6e963d267c0cc06871517ed3 761 733 2024-09-18T08:39:31Z Mjh 2 wikitext text/x-wiki Several machines on the cluster have attached NVIDIA GPUs. * gpu2 and gpu3: These both have 3 Tesla V100 units with 16 GB VRAM each on gpu3 and 32 GB VRAM on gpu2. * gpu4: This has one Tesla V100S and two V100s, with a mixture of 16 GB and 32 GB. * gpu5-8: These each have 4 Tesla A100s with 80 GB RAM. * ramius has a single Tesla K40c. * dgx1 has 8 Tesla A100s. ramius and dgx1 are private machines, the other machines are accessible through the gpu queue. The NVIDIA CUDA software is installed in '/soft/cuda-VERSION' where VERSION is the version number of CUDA. You can do e.g. 
<tt>module load cuda-10.0</tt> to make sure the libraries and compilers are on your path. Note: * At the moment, there is no provision for preventing contention between users of the Tesla unit. That is, if more than one user runs a job that tries to use it at one time, there may be contention and we don't know what effect this will have. * Equally, there is no provision for preventing contention on the host side -- if you only ask for a few CPU cores and another CPU-bound job takes the remainder, this may (or may not) interfere with your job. The easiest way to overcome any issues of contention would be to ask for all cores of the host machine, even if you're not going to use them all: then the job submission system will not allow any other jobs to run on the host, so you have exclusive access to the GPU. The current arrangement in principle allows non-GPU jobs running on a machine with a GPU to block CUDA jobs. If this becomes a problem, we will review the queueing arrangements. == Tensorflow == Tensorflow packages are available under python3.6. Make sure CUDA 10.0 is loaded (see above) and <tt>/soft/python3/usr/local/lib64/python3.6/site-packages</tt> is on your PYTHONPATH e.g. by doing <tt>module load python3 cuda-10.0</tt>. If you are running on a GPU machine you will then get GPU acceleration in Tensorflow. == Via OpenGL context == It is possible for an application to access an OpenGL context on the GPUs using the provided 'headless' X server configuration. * User needs to start X server: <pre> X :42 & </pre> where "42" is a free display number, assuming "42" is free. If a number is chosen that is not free, an error message will appear. * Set the DISPLAY environment variable: <pre> export DISPLAY=:42.0 </pre> where "42" is the free display number as used for starting the X server, assuming the syntax of the BASH shell and .0 denotes the first screen (there are currently four - but using more than one is untested). * start application, which will need to request an OpenGL context, make use of it, output results unattended and quit after it is done (a render job for example). Submission of such jobs through the job queue is also untested, so needs to be discussed with the administrators first. 1323d6ff3143c089f1287d335d7a29537d152a3b Architecture 0 7 734 724 2024-02-28T19:26:44Z Mjh 2 wikitext text/x-wiki The cluster consists of == head/login nodes == * 3 head nodes: headnode1, headnode2, headnode3 each with 192 GB RAM and 2 x 12-core Xeon processors, for user login and development * job/queue server, uhhpc, with 12 cores and 32 GB RAM: web, jobs, mail, DNS etc == compute nodes == * 80 Xeons (Gold 6130) 2 socket x 16 core (no hyperthreading) with 192 GB RAM and FDR Infiniband (node001-080: rack1, rack2, rack3), in the main queue * 16 AMD EPYC 7452 2 socket x 32-core (no hyperthreading) with 256 GB RAM and FDR Infiniband (node097-112: rack4) in the large queue. * 16 AMD EPYC 7552 2 socket x 48-core (no hyperthreading) with 256 GB RAM and FDR Infiniband (node081-096: rack5) in the test queue * Six [[GPUs|GPU]] machines, gpu1-6, in the gpu queue * Five [[smp machines]], smp4-8 with varying specifications * Coming soon, 6 new GPU machines (gpu7-12) with 4xA100 GPUs. 
* Coming soon, 8 new 96-core nodes == GPUs == * See the separate page on [[GPUs]] == Servers and dedicated login nodes == * A separate server -- [[lofar-server]] -- providing logins and 100 TB of storage for [[LOFAR-UK Compute Facility|LOFAR-UK]] users * ramius, a dedicated 16-core machine * mancuso, a dedicated 20-core machine * smp4 and smp5, dedicated 32-core and 48-core machines * smp6-8, 96-core 1-TB RAM systems * metadata and dstorage1-30, file servers providing 4.2 PB of BeegFS distributed [[storage]] == Networking == * Ethernet and infiniband switches provide connectivity. b916965863e435fc4c22f5068eddaa5e0075a10b 735 734 2024-02-28T19:27:03Z Mjh 2 wikitext text/x-wiki The cluster consists of == head/login nodes == * 3 head nodes: headnode1, headnode2, headnode3 each with 192 GB RAM and 2 x 12-core Xeon processors, for user login and development * job/queue server, uhhpc, with 12 cores and 32 GB RAM: web, jobs, mail, DNS etc == compute nodes == * 80 Xeons (Gold 6130) 2 socket x 16 core (no hyperthreading) with 192 GB RAM and FDR Infiniband (node001-080: rack1, rack2, rack3), in the main queue * 16 AMD EPYC 7452 2 socket x 32-core (no hyperthreading) with 256 GB RAM and FDR Infiniband (node097-112: rack4) in the large queue. * 16 AMD EPYC 7552 2 socket x 48-core (no hyperthreading) with 256 GB RAM and FDR Infiniband (node081-096: rack5) in the test queue * Six [[GPUs|GPU]] machines, gpu1-6, in the gpu queue * Five [[smp]] machines, smp4-8 with varying specifications * Coming soon, 6 new GPU machines (gpu7-12) with 4xA100 GPUs. * Coming soon, 8 new 96-core nodes == GPUs == * See the separate page on [[GPUs]] == Servers and dedicated login nodes == * A separate server -- [[lofar-server]] -- providing logins and 100 TB of storage for [[LOFAR-UK Compute Facility|LOFAR-UK]] users * ramius, a dedicated 16-core machine * mancuso, a dedicated 20-core machine * smp4 and smp5, dedicated 32-core and 48-core machines * smp6-8, 96-core 1-TB RAM systems * metadata and dstorage1-30, file servers providing 4.2 PB of BeegFS distributed [[storage]] == Networking == * Ethernet and infiniband switches provide connectivity. 044cf5ea50472355f587d202d70fafea485d6ec1 736 735 2024-02-28T19:27:29Z Mjh 2 /* compute nodes */ wikitext text/x-wiki The cluster consists of == head/login nodes == * 3 head nodes: headnode1, headnode2, headnode3 each with 192 GB RAM and 2 x 12-core Xeon processors, for user login and development * job/queue server, uhhpc, with 12 cores and 32 GB RAM: web, jobs, mail, DNS etc == compute nodes == * 80 Xeons (Gold 6130) 2 socket x 16 core (no hyperthreading) with 192 GB RAM and FDR Infiniband (node001-080: rack1, rack2, rack3), in the main queue * 16 AMD EPYC 7452 2 socket x 32-core (no hyperthreading) with 256 GB RAM and FDR Infiniband (node097-112: rack4) in the large queue. * 16 AMD EPYC 7552 2 socket x 48-core (no hyperthreading) with 256 GB RAM and FDR Infiniband (node081-096: rack5) in the test queue * Six [[GPUs|GPU]] machines, gpu1-6, in the gpu queue * Five [[SMP machines]], smp4-8 with varying specifications * Coming soon, 6 new GPU machines (gpu7-12) with 4xA100 GPUs. 
* Coming soon, 8 new 96-core nodes == GPUs == * See the separate page on [[GPUs]] == Servers and dedicated login nodes == * A separate server -- [[lofar-server]] -- providing logins and 100 TB of storage for [[LOFAR-UK Compute Facility|LOFAR-UK]] users * ramius, a dedicated 16-core machine * mancuso, a dedicated 20-core machine * smp4 and smp5, dedicated 32-core and 48-core machines * smp6-8, 96-core 1-TB RAM systems * metadata and dstorage1-30, file servers providing 4.2 PB of BeegFS distributed [[storage]] == Networking == * Ethernet and infiniband switches provide connectivity. 73e4f98f24df3c0534f711f2df6127e8fe3065ce 753 736 2024-06-13T13:04:38Z Jmcgarry 18 /* compute nodes */ wikitext text/x-wiki The cluster consists of == head/login nodes == * 3 head nodes: headnode1, headnode2, headnode3 each with 192 GB RAM and 2 x 12-core Xeon processors, for user login and development * job/queue server, uhhpc, with 12 cores and 32 GB RAM: web, jobs, mail, DNS etc == compute nodes == * 80 Xeons (Gold 6130) 2 socket x 16 core (no hyperthreading) with 192 GB RAM and FDR Infiniband (node001-080: rack1, rack2, rack3), in the main queue * 16 AMD EPYC 7452 2 socket x 32-core (no hyperthreading) with 256 GB RAM and FDR Infiniband (node097-112: rack4) in the large queue * 16 AMD EPYC 7552 2 socket x 48-core (no hyperthreading) with 256 GB RAM and FDR Infiniband (node081-096: rack5) in the test queue * 7 AMD EPYC 7643 2 socket x 48-core (no hyperthreading) with 256 GB RAM and EDR Infiniband (node113-119: rack9) in the rocky-test queue * Six [[GPUs|GPU]] machines, gpu1-6, in the gpu queue * Five [[SMP machines]], smp4-8 with varying specifications * Coming soon, 6 new GPU machines (gpu7-12) with 4xA100 GPUs. == GPUs == * See the separate page on [[GPUs]] == Servers and dedicated login nodes == * A separate server -- [[lofar-server]] -- providing logins and 100 TB of storage for [[LOFAR-UK Compute Facility|LOFAR-UK]] users * ramius, a dedicated 16-core machine * mancuso, a dedicated 20-core machine * smp4 and smp5, dedicated 32-core and 48-core machines * smp6-8, 96-core 1-TB RAM systems * metadata and dstorage1-30, file servers providing 4.2 PB of BeegFS distributed [[storage]] == Networking == * Ethernet and infiniband switches provide connectivity. 11475cc13288dce3f9d2cb59239340a353e7d9cc 760 753 2024-09-18T08:38:53Z Mjh 2 wikitext text/x-wiki The cluster consists of == head/login nodes == * 3 head nodes: headnode1, headnode2, headnode3 each with 192 GB RAM and 2 x 12-core Xeon processors, for user login and development * job/queue server, uhhpc, with 12 cores and 32 GB RAM: web, jobs, mail, DNS etc == compute nodes == * 80 Xeons (Gold 6130) 2 socket x 16 core (no hyperthreading) with 192 GB RAM and FDR Infiniband (node001-080: rack1, rack2, rack3), in the core32 queue * 16 AMD EPYC 7452 2 socket x 32-core (no hyperthreading) with 256 GB RAM and FDR Infiniband (node097-112: rack4) in the core64 queue * 24 AMD EPYC 7552 2 socket x 48-core (no hyperthreading) with 256 GB RAM and FDR Infiniband (node081-096: rack5) in the core96 queue * 8 [[GPUs|GPU]] machines, gpu1-8, in the gpu queue * 5 [[SMP machines]], smp4-8 with varying specifications, some with >~ 1 TB RAM * Coming soon, 4 new GPU machines (gpu9-12) with 4xA100 GPUs. 
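The queue names given above (core32, core64, core96, gpu) can be checked directly from a head node; as a minimal sketch, assuming the standard Torque/PBS <tt>qstat</tt> client referred to elsewhere in this documentation:

<pre>
# list the configured queues, their limits and current state
qstat -q
</pre>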
== GPUs == * See the separate page on [[GPUs]] == Servers and dedicated login nodes == * A separate server -- [[lofar-server]] -- providing logins and 100 TB of storage for [[LOFAR-UK Compute Facility|LOFAR-UK]] users * ramius, a dedicated 16-core machine * mancuso, a dedicated 20-core machine * metadata and dstorage1-30, file servers providing 6 PB of BeeGFS distributed [[storage]] == Networking == * Ethernet and infiniband switches provide connectivity. 135867f7304dee4633dd5f8b3a385777596774d0 Networking 0 10 737 545 2024-02-28T19:29:13Z Mjh 2 wikitext text/x-wiki The nodes are linked by Gigabit ethernet and Infiniband networks. The ethernet network is a fairly conventional one; each chassis (see [[architecture]]) has an internal ethernet switch to which all nodes in that chassis are connected. These switches, together with the head node, are connected together via a main ethernet switch. The links connecting the head node and the SMP machines, and the uplinks from the two node internal ethernet stacks, are carried by multiple physical cables via link aggregation. A separate physical ethernet network carries management traffic. The infiniband network is slightly more complex. Each rack (see [[Architecture]]) has an internal infiniband switch and these are all linked via a main switch. Data rates are up to 200 GB/s. The networking arrangements should be taken into consideration when running jobs where fast communication is important -- latency is lower and data transfer rates are somewhat higher between nodes in the same rack than between different racks, and ethernet connections are higher-latency and lower-bandwidth still. Best results will be obtained for inter-process communication (IPC) if jobs run in the same rack. The scheduler is aware of this and will try to ensure that jobs do not span more than one rack. None of the nodes are on the public Internet: they all have private IP addresses starting 192.168.... These addresses are not routable from anywhere else in the University. They can, however, all see the public Internet via IP masquerading through the head node over the ethernet network. (For this reason, it is not sensible to try to make high-volume accesses to the Internet from the nodes.) Each node has an IP address corresponding to its ethernet port (192.168.2.xxx where xxx is the node number) and one corresponding to the Infiniband port (192.168.3.xxx). Within the cluster, you may use nodexxx.data (e.g. node001.data) and nodexxx.infi (node001.infi) to refer to these two networks. High-volume traffic should use the Infiniband network. The SMP machines have addresses smp1.data, smp1.infi etc. aa58db20468843c25630437a56c768af4b87b834 Storage 0 8 738 707 2024-02-28T19:30:25Z Mjh 2 /* System-wide NFS storage */ wikitext text/x-wiki The '''storage''' available on the cluster comes in three flavours. Please use all cluster storage space responsibly: when you no longer need it, delete the contents or transfer them elsewhere so that others can use the space. Large datasets will be processed on the cluster. (See also [[policies]].) == Overview == Most cluster users will use their home directories (on /home and /home2) for small amounts of important data (no more than 50 GB) and /beegfs/general for large amounts of data. Your home directory is created automatically for you, but you will need to make a directory /beegfs/general/your_username (unless you are a member of a special group with dedicated storage, see below) to work on the /beegfs volume.
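As a minimal sketch of that step (replace <tt>your_username</tt> with your actual cluster username):

<pre>
# create your personal working area on the general BeeGFS allocation
mkdir /beegfs/general/your_username
</pre>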
For details of all the different areas available, read more below. == System-wide NFS storage == This is shared across all machines and served by single servers in each case. It can be slow if many users are trying to access it simultaneously. We are phasing out the use of this kind of storage for large volumes. Currently general-user NFS volumes are: * 6.5 TB of user home directories, mounted as /home and /home2 * Software directory /soft In addition there are some special NFS volumes, which are mounted under /data and are only available to particular users or groups: * /data/lofar : for LOFAR-UK users * /data/jim : Jim Geach * /data/astroml : Machine learning group The /home disk is for files that are important and require long-term storage (source code, qsub scripts, etc). It should not be used for general data or for large quantities of output from jobs, for example. Large files should be stored on the relevant data disk (if you want to work relative to /home, use a symbolic link). A [[quota]] system is in place on /home. == Distributed file system == The cluster uses the [https://www.beegfs.io/content/ BeeGFS] file system for distributed storage, mounted at /beegfs. Files stored here are distributed over a number of servers in a way that's transparent to the user, who sees one big file system. Because there are multiple servers, /beegfs scales much better under load than the NFS storage as long as it is not full. There is 3.8 PB of BeeGFS storage nominally distributed as follows: * 486 TB: general use, under /beegfs/general * 1122 TB: CAR, under /beegfs/car * 500 TB: LOFAR-UK, under /beegfs/lofar * 671 TB: CACP, under /beegfs/cair Using the relevant subdirectory simply indicates which allocation you believe you are entitled to use. In practice there is no difference in where or how the data are stored. == Disks local to machines == Nodes have a small amount of [[local disk space]] accessible to jobs (but not mounted on the head node), and [[ramdisks]] are also available. == Backups == No data area on the cluster is currently backed up. You must take responsibility for your own backups. However, /home and /home2 are backed up by nightly rsync to another location on the cluster: that means if you delete a crucial file from your home directory we ''may'' have a useful backup (ask straight away). 5bde074fd41627400e9a080c06860aae035a5eb3 754 738 2024-06-15T11:14:29Z Mjh 2 wikitext text/x-wiki The '''storage''' available on the cluster comes in three flavours. Please use all cluster storage space responsibly: when you no longer need it, delete the contents or transfer them elsewhere so that others can use the space. Large datasets will be processed on the cluster. (See also [[policies]].) == Overview == Most cluster users will use their home directories (on /home and /home2) for small amounts of important data (no more than 50 GB) and /beegfs/general for large amounts of data. Your home directory is created automatically for you, but you will need to make a directory /beegfs/general/your_username (unless you are a member of a special group with dedicated storage, see below) to work on the /beegfs volume. For details of all the different areas available, read more below. == System-wide NFS storage == This is shared across all machines and served by single servers in each case. It can be slow if many users are trying to access it simultaneously. We are phasing out the use of this kind of storage for large volumes.
Currently general-user NFS volumes are: * 6.5 TB of user home directories, mounted as /home and /home2 * Software directory /soft In addition there are some special NFS volumes, which are mounted under /data and are only available to particular users or groups: * /data/lofar : for LOFAR-UK users * /data/jim : Jim Geach * /data/astroml : Machine learning group The /home disk is for files that are important and require long-term storage (source code, qsub scripts, etc). It should not be used for general data or for large quantities of output from jobs, for example. Large files should be stored on the relevant data disk (if you want to work relative to /home, use a symbolic link). A [[quota]] system is in place on /home. == Distributed file system == The cluster uses the [https://www.beegfs.io/content/ BeeGFS] file system for distributed storage, mounted at /beegfs. Files stored here are distributed over a number of servers in a way that's transparent to the user, who sees one big file system. Because there are multiple servers, /beegfs scales much better under load than the NFS storage as long as it is not full. There is 5.7 PB of BeeGFS storage nominally distributed as follows: * general use, under /beegfs/general * CAR, under /beegfs/car * LOFAR-UK, under /beegfs/lofar * CACP, under /beegfs/cair Using the relevant subdirectory simply indicates which allocation you believe you are entitled to use. In practice there is no difference in where or how the data are stored. == Disks local to machines == Nodes have a small amount of [[local disk space]] accessible to jobs (but not mounted on the head node), and [[ramdisks]] are also available. == Backups == No data area on the cluster is currently backed up. You must take responsibility for your own backups. However, /home and /home2 are backed up by nightly rsync to another location on the cluster: that means if you delete a crucial file from your home directory we ''may'' have a useful backup (ask straight away). 7bcc61dc7bfe976b6beaf1ea579d362570a30841 Singularity 0 79 739 712 2024-02-29T10:02:53Z Mjh 2 wikitext text/x-wiki Singularity ([https://github.com/sylabs/singularity]) is installed on the cluster. This is our preferred way of running containerized applications, including Docker containers. You need to use the singularity [[modules|module]] (<tt>module load singularity</tt>) to get singularity on your path. You probably want to use the --bind option to bind data directories such as /beegfs. Singularity images can't by default be built from recipes on the cluster because this requires root privileges. If you need this (via the 'fakeroot' functionality) please contact the sysadmins, or else build the containers on a machine where you have the appropriate privileges and push to a library or copy over. e694fb6218baa89eb63b1e5245f495c9401a7099 Dask 0 93 740 2024-03-08T15:05:53Z Jmcgarry 18 Created page with "under construction Dask is a python package that allows dynamic task scheduling as well as parallelisation of tasks (e.g. for-loop codes), DataFrames (pandas), Arrays (numpy..." wikitext text/x-wiki under construction Dask is a python package that allows dynamic task scheduling as well as parallelisation of tasks (e.g. for-loop codes), DataFrames (pandas), Arrays (numpy), and Bags (Python lists). == Setting up Dask == First you will need a python installation. It is probably best to install miniconda to run your own environment. This example was run in a 3.12 environment.
It will run a single 32-core node on the main queue with the default 1GB RAM per core and connect to the dashboard on your local machine. In your python environment, install packages using pip3: * dask * dask distributed * bokeh (this is necessary to run a dashboard) * ipython (make sure this is up-to-date in your environment) You may also wish to install: * numpy * pandas * scikit-learn == Launching a Dask cluster on UHHPC == Launch <tt>ipython</tt> on the headnode to interactively start a dask-cluster. Copy the following commands to your terminal. <pre> import dask from dask_jobqueue import PBSCluster </pre> The cores and memory values for the call to PBSCluster do not reflect the resources you will get on your job, they are the maximum available. If <tt>resource_spec</tt> isn't set to the desired amount the dask-workers will default to 1 core on 1 node with 1GB RAM. The default walltime is 30 minutes. <pre> cluster = PBSCluster(cores=32, memory='190GB', queue='main', resource_spec='nodes=1:ppn=32', walltime='01:00:00') </pre> There are further options that can refine the <tt>qsub</tt> script submitted to job scheduler. A more advanced user could set up a jobqueue.yaml configuration file in: ~/.config/dask Calling cluster.scale() will allow you to set up workers. Here we start a single worker: <pre> cluster.scale(1) </pre> We will also connect to the dask distributed client. <pre> from dask.distributed import Client client = Client(cluster) </pre> Calling client from the terminal will show you the current number of active workers, threads, and memory. <pre> client </pre> You can check the status of your jobs from within the ipython interface using: <pre> !qstat -at -u <username> </pre> You can view the job script submitted by dask to the scheduler by using: <pre> print(cluster.job_script()) </pre> == Connecting to a dashboard == To open the dashboard you will need to create an ssh-tunnel from your local machine to UHHPC. The default port for Dask is 8787, but this may be different if the port is active. Dask will warn you if 8787 is in use and choose another port. Note which headnode (1-3) you are running ipython on and replace X in the command below. This will set up an ssh-tunnel as a background process. <pre> ssh -N -f -L localhost:8787:headnode<X>:<port> <username>@uhhpc.herts.ac.uk </pre> You should now be able to access the dashboard through a browser on your local machine by connecting to: localhost:8787/status If you need to find and kill any background ssh-tunnels, you can use ps aux to get the PID. <pre> ps aux | grep -E 'PID|ssh' kill <PID> </pre> 14a601659aad3ce761b95a47e1bce5ae62cc5f53 741 740 2024-03-08T15:07:43Z Jmcgarry 18 wikitext text/x-wiki Dask is a python package that allows dynamic task scheduling as well as parallelisation of tasks (e.g. for-loop codes), DataFrames (pandas), Arrays (numpy), and Bags (Python lists). == Setting up Dask == First you will need a python installation. It is probably best to install miniconda to run your own environment. This example was run in a 3.12 environment. It will run a single 32-core node on the main queue with the default 1GB RAM per core and connect to the dashboard on your local machine. 
In your python environment, install packages using pip3: * dask * dask distributed * bokeh (this is necessary to run a dashboard) * ipython (make sure this is up-to-date in your environment) You may also wish to install: * numpy * pandas * scikit-learn == Launching a Dask cluster on UHHPC == Launch <tt>ipython</tt> on the headnode to interactively start a dask-cluster. Copy the following commands to your terminal. <pre> import dask from dask_jobqueue import PBSCluster </pre> The cores and memory values for the call to PBSCluster do not reflect the resources you will get on your job, they are the maximum available. If <tt>resource_spec</tt> isn't set to the desired amount the dask-workers will default to 1 core on 1 node with 1GB RAM. The default walltime is 30 minutes. <pre> cluster = PBSCluster(cores=32, memory='190GB', queue='main', resource_spec='nodes=1:ppn=32', walltime='01:00:00') </pre> There are further options that can refine the <tt>qsub</tt> script submitted to job scheduler. A more advanced user could set up a jobqueue.yaml configuration file in: ~/.config/dask Calling cluster.scale() will allow you to set up workers. Here we start a single worker: <pre> cluster.scale(1) </pre> We will also connect to the dask distributed client. <pre> from dask.distributed import Client client = Client(cluster) </pre> Calling client from the terminal will show you the current number of active workers, threads, and memory. <pre> client </pre> You can check the status of your jobs from within the ipython interface using: <pre> !qstat -at -u <username> </pre> You can view the job script submitted by dask to the scheduler by using: <pre> print(cluster.job_script()) </pre> == Connecting to a dashboard == To open the dashboard you will need to create an ssh-tunnel from your local machine to UHHPC. The default port for Dask is 8787, but this may be different if the port is active. Dask will warn you if 8787 is in use and choose another port. Note which headnode (1-3) you are running ipython on and replace X in the command below. This will set up an ssh-tunnel as a background process. <pre> ssh -N -f -L localhost:8787:headnode<X>:<port> <username>@uhhpc.herts.ac.uk </pre> You should now be able to access the dashboard through a browser on your local machine by connecting to: localhost:8787/status If you need to find and kill any background ssh-tunnels, you can use ps aux to get the PID. <pre> ps aux | grep -E 'PID|ssh' kill <PID> </pre> 755879a2c2413f415015628f483324129a945ec7 742 741 2024-03-08T15:11:16Z Jmcgarry 18 wikitext text/x-wiki Dask is a python package that allows dynamic task scheduling as well as parallelisation of tasks (e.g. for-loop codes), DataFrames (pandas), Arrays (numpy), and Bags (Python lists). == Setting up Dask == First you will need a python installation. It is probably best to install miniconda to run your own environment. In your python environment, install packages using pip3: * dask * dask distributed * bokeh (this is necessary to run a dashboard) * ipython (make sure this is up-to-date in your environment) You may also wish to install: * numpy * pandas * scikit-learn == Launching a Dask cluster on UHHPC == It will run 4 cores on a single node on the main queue with the default 1GB RAM per core. Launch <tt>ipython</tt> on the headnode to interactively start a dask-cluster. Copy the following commands to your terminal. 
<pre> import dask from dask_jobqueue import PBSCluster </pre> The cores and memory values for the call to PBSCluster do not reflect the resources you will get on your job, they are the maximum available. If <tt>resource_spec</tt> isn't set to the desired amount the dask-workers will default to 1 core on 1 node with 1GB RAM. The default walltime is 30 minutes. <pre> cluster = PBSCluster(cores=32, memory='190GB', queue='main', resource_spec='nodes=1:ppn=32', walltime='01:00:00') </pre> There are further options that can refine the <tt>qsub</tt> script submitted to job scheduler. A more advanced user could set up a jobqueue.yaml configuration file in: ~/.config/dask Calling cluster.scale() will allow you to set up workers. Here we start a single worker: <pre> cluster.scale(1) </pre> We will also connect to the Dask Distributed client. <pre> from dask.distributed import Client client = Client(cluster) </pre> Calling client from the terminal will show you the current number of active workers, threads, and memory. <pre> client </pre> You can check the status of your jobs from within the ipython interface using: <pre> !qstat -at -u <username> </pre> You can view the job script submitted by Dask to the scheduler by using: <pre> print(cluster.job_script()) </pre> == Connecting to a dashboard == To open the dashboard you will need to create an ssh-tunnel from your local machine to UHHPC. The default port for Dask is 8787, but this may be different if the port is active. Dask will warn you if 8787 is in use and choose another port. Note which headnode (1-3) you are running ipython on and replace X in the command below. This will set up an ssh-tunnel as a background process. <pre> ssh -N -f -L localhost:8787:headnode<X>:<port> <username>@uhhpc.herts.ac.uk </pre> You should now be able to access the dashboard through a browser on your local machine by connecting to: localhost:8787/status If you need to find and kill any background ssh-tunnels, you can use ps aux to get the PID. <pre> ps aux | grep -E 'PID|ssh' kill <PID> </pre> 9d05549e4c40e9886e0e6a35e85688c1a9d184da 743 742 2024-03-08T15:12:17Z Jmcgarry 18 /* Launching a Dask cluster on UHHPC */ wikitext text/x-wiki Dask is a python package that allows dynamic task scheduling as well as parallelisation of tasks (e.g. for-loop codes), DataFrames (pandas), Arrays (numpy), and Bags (Python lists). == Setting up Dask == First you will need a python installation. It is probably best to install miniconda to run your own environment. In your python environment, install packages using pip3: * dask * dask distributed * bokeh (this is necessary to run a dashboard) * ipython (make sure this is up-to-date in your environment) You may also wish to install: * numpy * pandas * scikit-learn == Launching a Dask cluster on UHHPC == It will run 4 cores on a single node on the main queue with the default 1GB RAM per core. Launch <tt>ipython</tt> on the headnode to interactively start a dask-cluster. Copy the following commands to your terminal. <pre> import dask from dask_jobqueue import PBSCluster </pre> The cores and memory values for the call to PBSCluster do not reflect the resources you will get on your job, they are the maximum available. If <tt>resource_spec</tt> isn't set to the desired amount the dask-workers will default to 1 core on 1 node with 1GB RAM. The default <tt>walltime</tt> is 30 minutes. 
<pre> cluster = PBSCluster(cores=32, memory='190GB', queue='main', resource_spec='nodes=1:ppn=32', walltime='01:00:00') </pre> There are further options that can refine the <tt>qsub</tt> script submitted to job scheduler. A more advanced user could set up a jobqueue.yaml configuration file in: ~/.config/dask Calling cluster.scale() will allow you to set up workers. Here we start a single worker: <pre> cluster.scale(1) </pre> We will also connect to the Dask Distributed client. <pre> from dask.distributed import Client client = Client(cluster) </pre> Calling client from the terminal will show you the current number of active workers, threads, and memory. <pre> client </pre> You can check the status of your jobs from within the ipython interface using: <pre> !qstat -at -u <username> </pre> You can view the job script submitted by Dask to the scheduler by using: <pre> print(cluster.job_script()) </pre> == Connecting to a dashboard == To open the dashboard you will need to create an ssh-tunnel from your local machine to UHHPC. The default port for Dask is 8787, but this may be different if the port is active. Dask will warn you if 8787 is in use and choose another port. Note which headnode (1-3) you are running ipython on and replace X in the command below. This will set up an ssh-tunnel as a background process. <pre> ssh -N -f -L localhost:8787:headnode<X>:<port> <username>@uhhpc.herts.ac.uk </pre> You should now be able to access the dashboard through a browser on your local machine by connecting to: localhost:8787/status If you need to find and kill any background ssh-tunnels, you can use ps aux to get the PID. <pre> ps aux | grep -E 'PID|ssh' kill <PID> </pre> dcd2f22487a172c99fdf2adf78f11990b5161944 744 743 2024-03-08T15:12:53Z Jmcgarry 18 /* Launching a Dask cluster on UHHPC */ wikitext text/x-wiki Dask is a python package that allows dynamic task scheduling as well as parallelisation of tasks (e.g. for-loop codes), DataFrames (pandas), Arrays (numpy), and Bags (Python lists). == Setting up Dask == First you will need a python installation. It is probably best to install miniconda to run your own environment. In your python environment, install packages using pip3: * dask * dask distributed * bokeh (this is necessary to run a dashboard) * ipython (make sure this is up-to-date in your environment) You may also wish to install: * numpy * pandas * scikit-learn == Launching a Dask cluster on UHHPC == It will run 4 cores on a single node on the main queue with the default 1GB RAM per core. Launch <tt>ipython</tt> on the headnode to interactively start a dask-cluster. Copy the following commands to your terminal. <pre> import dask from dask_jobqueue import PBSCluster </pre> The cores and memory values for the call to PBSCluster do not reflect the resources you will get on your job, they are the maximum available. If <tt>resource_spec</tt> isn't set to the desired amount the dask-workers will default to 1 core on 1 node with 1GB RAM. The default <tt>walltime</tt> is 30 minutes. <pre> cluster = PBSCluster(cores=32, memory='190GB', queue='main', resource_spec='nodes=1:ppn=4', walltime='01:00:00') </pre> There are further options that can refine the <tt>qsub</tt> script submitted to job scheduler. A more advanced user could set up a jobqueue.yaml configuration file in: ~/.config/dask Calling cluster.scale() will allow you to set up workers. Here we start a single worker: <pre> cluster.scale(1) </pre> We will also connect to the Dask Distributed client. 
<pre> from dask.distributed import Client client = Client(cluster) </pre> Calling client from the terminal will show you the current number of active workers, threads, and memory. <pre> client </pre> You can check the status of your jobs from within the ipython interface using: <pre> !qstat -at -u <username> </pre> You can view the job script submitted by Dask to the scheduler by using: <pre> print(cluster.job_script()) </pre> == Connecting to a dashboard == To open the dashboard you will need to create an ssh-tunnel from your local machine to UHHPC. The default port for Dask is 8787, but this may be different if the port is active. Dask will warn you if 8787 is in use and choose another port. Note which headnode (1-3) you are running ipython on and replace X in the command below. This will set up an ssh-tunnel as a background process. <pre> ssh -N -f -L localhost:8787:headnode<X>:<port> <username>@uhhpc.herts.ac.uk </pre> You should now be able to access the dashboard through a browser on your local machine by connecting to: localhost:8787/status If you need to find and kill any background ssh-tunnels, you can use ps aux to get the PID. <pre> ps aux | grep -E 'PID|ssh' kill <PID> </pre> a24dd1ca637c96e81e77cc0698a1a614921a0b74 745 744 2024-03-08T15:13:20Z Jmcgarry 18 /* Launching a Dask cluster on UHHPC */ wikitext text/x-wiki Dask is a python package that allows dynamic task scheduling as well as parallelisation of tasks (e.g. for-loop codes), DataFrames (pandas), Arrays (numpy), and Bags (Python lists). == Setting up Dask == First you will need a python installation. It is probably best to install miniconda to run your own environment. In your python environment, install packages using pip3: * dask * dask distributed * bokeh (this is necessary to run a dashboard) * ipython (make sure this is up-to-date in your environment) You may also wish to install: * numpy * pandas * scikit-learn == Launching a Dask cluster on UHHPC == The following example will run 4 cores on a single node on the main queue with the default 1GB RAM per core. Launch <tt>ipython</tt> on the headnode to interactively start a dask-cluster. Copy the following commands to your terminal. <pre> import dask from dask_jobqueue import PBSCluster </pre> The cores and memory values for the call to PBSCluster do not reflect the resources you will get on your job, they are the maximum available. If <tt>resource_spec</tt> isn't set to the desired amount the dask-workers will default to 1 core on 1 node with 1GB RAM. The default <tt>walltime</tt> is 30 minutes. <pre> cluster = PBSCluster(cores=32, memory='190GB', queue='main', resource_spec='nodes=1:ppn=4', walltime='01:00:00') </pre> There are further options that can refine the <tt>qsub</tt> script submitted to job scheduler. A more advanced user could set up a jobqueue.yaml configuration file in: ~/.config/dask Calling cluster.scale() will allow you to set up workers. Here we start a single worker: <pre> cluster.scale(1) </pre> We will also connect to the Dask Distributed client. <pre> from dask.distributed import Client client = Client(cluster) </pre> Calling client from the terminal will show you the current number of active workers, threads, and memory. 
<pre> client </pre> You can check the status of your jobs from within the ipython interface using: <pre> !qstat -at -u <username> </pre> You can view the job script submitted by Dask to the scheduler by using: <pre> print(cluster.job_script()) </pre> == Connecting to a dashboard == To open the dashboard you will need to create an ssh-tunnel from your local machine to UHHPC. The default port for Dask is 8787, but this may be different if the port is active. Dask will warn you if 8787 is in use and choose another port. Note which headnode (1-3) you are running ipython on and replace X in the command below. This will set up an ssh-tunnel as a background process. <pre> ssh -N -f -L localhost:8787:headnode<X>:<port> <username>@uhhpc.herts.ac.uk </pre> You should now be able to access the dashboard through a browser on your local machine by connecting to: localhost:8787/status If you need to find and kill any background ssh-tunnels, you can use ps aux to get the PID. <pre> ps aux | grep -E 'PID|ssh' kill <PID> </pre> 9cbf0dc52b6d236bce08932c46413bfb0d25c423 746 745 2024-03-08T15:15:38Z Jmcgarry 18 wikitext text/x-wiki Dask is a python package that allows dynamic task scheduling as well as parallelisation of tasks (e.g. for-loop codes), DataFrames (pandas), Arrays (numpy), and Bags (Python lists). == Setting up Dask == First you will need a python installation. It is probably best to install miniconda to run your own environment. In your python environment, install packages using pip3: * dask * dask distributed * bokeh (this is necessary to run a dashboard) * ipython (make sure this is up-to-date in your environment) You may also wish to install: * numpy * pandas * scikit-learn == Launching a Dask cluster on UHHPC == The following example will run 4 cores on a single node on the main queue with the default 1GB RAM per core. Launch <tt>ipython</tt> on the headnode to interactively start a dask-cluster. Copy the following commands to your terminal. <pre> import dask from dask_jobqueue import PBSCluster </pre> The cores and memory values for the call to <tt>PBSCluster</tt> do not reflect the resources you will get on your job, they are the maximum available. If <tt>resource_spec</tt> isn't set to the desired amount the dask-workers will default to 1 core on 1 node with 1GB RAM. The default <tt>walltime</tt> is 30 minutes. <pre> cluster = PBSCluster(cores=32, memory='190GB', queue='main', resource_spec='nodes=1:ppn=4', walltime='01:00:00') </pre> There are further options that can refine the <tt>qsub</tt> script submitted to job scheduler. A more advanced user could set up a jobqueue.yaml configuration file in: ~/.config/dask Calling cluster.scale() will allow you to set up workers. Here we start a single worker: <pre> cluster.scale(1) </pre> We will also connect to the Dask Distributed client. <pre> from dask.distributed import Client client = Client(cluster) </pre> Calling client from the terminal will show you the current number of active workers, threads, and memory. <pre> client </pre> You can check the status of your jobs from within the ipython interface using: <pre> !qstat -at -u <username> </pre> You can view the job script submitted by Dask to the scheduler by using: <pre> print(cluster.job_script()) </pre> == Connecting to a dashboard == To open the dashboard you will need to create an ssh-tunnel from your local machine to UHHPC. The default port for Dask is 8787, but this may be different if the port is active. Dask will warn you if 8787 is in use and choose another port. 
Note which headnode (1-3) you are running <tt>ipython</tt> on and replace X in the command below. This will set up an ssh-tunnel as a background process. <pre> ssh -N -f -L localhost:8787:headnode<X>:<port> <username>@uhhpc.herts.ac.uk </pre> You should now be able to access the dashboard through a browser on your local machine by connecting to: localhost:8787/status If you need to find and kill any background ssh-tunnels, you can use <tt>ps aux</tt> to get the PID. <pre> ps aux | grep -E 'PID|ssh' kill <PID> </pre> 979f9d3aa45294ce1efcb6117741e5247768a3e3 747 746 2024-03-08T15:20:28Z Jmcgarry 18 wikitext text/x-wiki Dask is a python package that allows dynamic task scheduling as well as parallelisation of tasks (e.g. for-loop codes), DataFrames (pandas), Arrays (numpy), and Bags (Python lists). For advice on using Dask, please see the [https://docs.dask.org/en/stable/index.html documentation] and [https://tutorial.dask.org/00_overview.html tutorials]. == Setting up Dask == First you will need a python installation. It is probably best to install miniconda to run your own environment. In your python environment, install packages using pip3: * dask * dask distributed * bokeh (this is necessary to run a dashboard) * ipython (make sure this is up-to-date in your environment) You may also wish to install: * numpy * pandas * scikit-learn == Launching a Dask cluster on UHHPC == The following example will run 4 cores on a single node on the main queue with the default 1GB RAM per core. Launch <tt>ipython</tt> on the headnode to interactively start a dask-cluster. Copy the following commands to your terminal. <pre> import dask from dask_jobqueue import PBSCluster </pre> The cores and memory values for the call to <tt>PBSCluster</tt> do not reflect the resources you will get on your job, they are the maximum available. If <tt>resource_spec</tt> isn't set to the desired amount the dask-workers will default to 1 core on 1 node with 1GB RAM. The default <tt>walltime</tt> is 30 minutes. <pre> cluster = PBSCluster(cores=32, memory='190GB', queue='main', resource_spec='nodes=1:ppn=4', walltime='01:00:00') </pre> There are further options that can refine the <tt>qsub</tt> script submitted to job scheduler. A more advanced user could set up a jobqueue.yaml configuration file in: ~/.config/dask Calling cluster.scale() will allow you to set up workers. Here we start a single worker: <pre> cluster.scale(1) </pre> We will also connect to the Dask Distributed client. <pre> from dask.distributed import Client client = Client(cluster) </pre> Calling client from the terminal will show you the current number of active workers, threads, and memory. <pre> client </pre> You can check the status of your jobs from within the ipython interface using: <pre> !qstat -at -u <username> </pre> You can view the job script submitted by Dask to the scheduler by using: <pre> print(cluster.job_script()) </pre> == Connecting to a dashboard == To open the dashboard you will need to create an ssh-tunnel from your local machine to UHHPC. The default port for Dask is 8787, but this may be different if the port is active. Dask will warn you if 8787 is in use and choose another port. Note which headnode (1-3) you are running <tt>ipython</tt> on and replace X in the command below. This will set up an ssh-tunnel as a background process. 
<pre> ssh -N -f -L localhost:8787:headnode<X>:<port> <username>@uhhpc.herts.ac.uk </pre> You should now be able to access the dashboard through a browser on your local machine by connecting to: localhost:8787/status If you need to find and kill any background ssh-tunnels, you can use <tt>ps aux</tt> to get the PID. <pre> ps aux | grep -E 'PID|ssh' kill <PID> </pre> 7b8b96ce0526761c21a5be11c36dd9bf92951562 748 747 2024-03-08T15:31:31Z Jmcgarry 18 /* Launching a Dask cluster on UHHPC */ wikitext text/x-wiki Dask is a python package that allows dynamic task scheduling as well as parallelisation of tasks (e.g. for-loop codes), DataFrames (pandas), Arrays (numpy), and Bags (Python lists). For advice on using Dask, please see the [https://docs.dask.org/en/stable/index.html documentation] and [https://tutorial.dask.org/00_overview.html tutorials]. == Setting up Dask == First you will need a python installation. It is probably best to install miniconda to run your own environment. In your python environment, install packages using pip3: * dask * dask distributed * bokeh (this is necessary to run a dashboard) * ipython (make sure this is up-to-date in your environment) You may also wish to install: * numpy * pandas * scikit-learn == Launching a Dask cluster on UHHPC == The following example will run 4 cores on a single node on the main queue with the default 1GB RAM per core. Launch <tt>ipython</tt> on the headnode to interactively start a dask-cluster. Copy the following commands to your terminal. <pre> import dask from dask_jobqueue import PBSCluster </pre> The cores and memory values for the call to <tt>PBSCluster</tt> do not reflect the resources you will get on your job, they are the maximum available. If <tt>resource_spec</tt> isn't set to the desired amount the dask-workers will default to 1 core on 1 node with 1GB RAM. The default <tt>walltime</tt> is 30 minutes. <pre> cluster = PBSCluster(cores=32, memory='190GB', queue='main', resource_spec='nodes=1:ppn=4', walltime='01:00:00') </pre> There are further options that can refine the <tt>qsub</tt> script submitted to job scheduler. A more advanced user could set up a jobqueue.yaml configuration file in: ~/.config/dask Calling cluster.scale() will allow you to set up workers. Here we start a single worker: <pre> cluster.scale(1) </pre> We will also connect to the Dask Distributed client. <pre> from dask.distributed import Client client = Client(cluster) </pre> Calling client from the terminal will show you the current number of active workers, threads, and memory. <pre> client </pre> You can check the status of your jobs from within the ipython interface using: <pre> !qstat -at -u <username> </pre> You can view the job script submitted by Dask to the scheduler by using: <pre> print(cluster.job_script()) </pre> == Connecting to a dashboard == To open the dashboard you will need to create an ssh-tunnel from your local machine to UHHPC. The default port for Dask is 8787, but this may be different if the port is active. Dask will warn you if 8787 is in use and choose another port. Note which headnode (1-3) you are running <tt>ipython</tt> on and replace X in the command below. This will set up an ssh-tunnel as a background process. <pre> ssh -N -f -L localhost:8787:headnode<X>:<port> <username>@uhhpc.herts.ac.uk </pre> You should now be able to access the dashboard through a browser on your local machine by connecting to: localhost:8787/status If you need to find and kill any background ssh-tunnels, you can use <tt>ps aux</tt> to get the PID. 
<pre> ps aux | grep -E 'PID|ssh' kill <PID> </pre> 9a49d8f8ed8d13d73d9da62b8bd1498bc171144c 749 748 2024-03-08T15:35:13Z Jmcgarry 18 /* Launching a Dask cluster on UHHPC */ wikitext text/x-wiki Dask is a python package that allows dynamic task scheduling as well as parallelisation of tasks (e.g. for-loop codes), DataFrames (pandas), Arrays (numpy), and Bags (Python lists). For advice on using Dask, please see the [https://docs.dask.org/en/stable/index.html documentation] and [https://tutorial.dask.org/00_overview.html tutorials]. == Setting up Dask == First you will need a python installation. It is probably best to install miniconda to run your own environment. In your python environment, install packages using pip3: * dask * dask distributed * bokeh (this is necessary to run a dashboard) * ipython (make sure this is up-to-date in your environment) You may also wish to install: * numpy * pandas * scikit-learn == Launching a Dask cluster on UHHPC == The following example will run 4 cores on a single node on the main queue with the default 1GB RAM per core. Launch <tt>ipython</tt> on the headnode to interactively start a dask-cluster. Copy the following commands to your terminal. <pre> import dask from dask_jobqueue import PBSCluster </pre> The cores and memory values for the call to <tt>PBSCluster</tt> do not reflect the resources you will get on your job, they are set as the maximum available on the type of machine you are using. Check the [[architecture]] page to get details of the machines in each queue If <tt>resource_spec</tt> isn't set to the desired amount the dask-workers will default to 1 core on 1 node with 1GB RAM. The default <tt>walltime</tt> is 30 minutes. <pre> cluster = PBSCluster(cores=32, memory='190GB', queue='main', resource_spec='nodes=1:ppn=4', walltime='01:00:00') </pre> There are further options that can refine the <tt>qsub</tt> script submitted to job scheduler. A more advanced user could set up a jobqueue.yaml configuration file in: ~/.config/dask Calling cluster.scale() will allow you to set up workers. Here we start a single worker: <pre> cluster.scale(1) </pre> We will also connect to the Dask Distributed client. <pre> from dask.distributed import Client client = Client(cluster) </pre> Calling client from the terminal will show you the current number of active workers, threads, and memory. <pre> client </pre> You can check the status of your jobs from within the ipython interface using: <pre> !qstat -at -u <username> </pre> You can view the job script submitted by Dask to the scheduler by using: <pre> print(cluster.job_script()) </pre> == Connecting to a dashboard == To open the dashboard you will need to create an ssh-tunnel from your local machine to UHHPC. The default port for Dask is 8787, but this may be different if the port is active. Dask will warn you if 8787 is in use and choose another port. Note which headnode (1-3) you are running <tt>ipython</tt> on and replace X in the command below. This will set up an ssh-tunnel as a background process. <pre> ssh -N -f -L localhost:8787:headnode<X>:<port> <username>@uhhpc.herts.ac.uk </pre> You should now be able to access the dashboard through a browser on your local machine by connecting to: localhost:8787/status If you need to find and kill any background ssh-tunnels, you can use <tt>ps aux</tt> to get the PID. 
<pre> ps aux | grep -E 'PID|ssh' kill <PID> </pre> 222e49a14db54d4139ef90b90b22ada8b1607637 750 749 2024-03-08T15:35:39Z Jmcgarry 18 /* Launching a Dask cluster on UHHPC */ wikitext text/x-wiki Dask is a python package that allows dynamic task scheduling as well as parallelisation of tasks (e.g. for-loop codes), DataFrames (pandas), Arrays (numpy), and Bags (Python lists). For advice on using Dask, please see the [https://docs.dask.org/en/stable/index.html documentation] and [https://tutorial.dask.org/00_overview.html tutorials]. == Setting up Dask == First you will need a python installation. It is probably best to install miniconda to run your own environment. In your python environment, install packages using pip3: * dask * dask distributed * bokeh (this is necessary to run a dashboard) * ipython (make sure this is up-to-date in your environment) You may also wish to install: * numpy * pandas * scikit-learn == Launching a Dask cluster on UHHPC == The following example will run 4 cores on a single node on the main queue with the default 1GB RAM per core. Launch <tt>ipython</tt> on the headnode to interactively start a dask-cluster. Copy the following commands to your terminal. <pre> import dask from dask_jobqueue import PBSCluster </pre> The cores and memory values for the call to <tt>PBSCluster</tt> do not reflect the resources you will get on your job, they are set as the maximum available on the type of machine you are using. Check the [[architecture]] page to get details of the machines in each queue. If <tt>resource_spec</tt> isn't set to the desired amount the dask-workers will default to 1 core on 1 node with 1GB RAM. The default <tt>walltime</tt> is 30 minutes. <pre> cluster = PBSCluster(cores=32, memory='190GB', queue='main', resource_spec='nodes=1:ppn=4', walltime='01:00:00') </pre> There are further options that can refine the <tt>qsub</tt> script submitted to job scheduler. A more advanced user could set up a jobqueue.yaml configuration file in: ~/.config/dask Calling cluster.scale() will allow you to set up workers. Here we start a single worker: <pre> cluster.scale(1) </pre> We will also connect to the Dask Distributed client. <pre> from dask.distributed import Client client = Client(cluster) </pre> Calling client from the terminal will show you the current number of active workers, threads, and memory. <pre> client </pre> You can check the status of your jobs from within the ipython interface using: <pre> !qstat -at -u <username> </pre> You can view the job script submitted by Dask to the scheduler by using: <pre> print(cluster.job_script()) </pre> == Connecting to a dashboard == To open the dashboard you will need to create an ssh-tunnel from your local machine to UHHPC. The default port for Dask is 8787, but this may be different if the port is active. Dask will warn you if 8787 is in use and choose another port. Note which headnode (1-3) you are running <tt>ipython</tt> on and replace X in the command below. This will set up an ssh-tunnel as a background process. <pre> ssh -N -f -L localhost:8787:headnode<X>:<port> <username>@uhhpc.herts.ac.uk </pre> You should now be able to access the dashboard through a browser on your local machine by connecting to: localhost:8787/status If you need to find and kill any background ssh-tunnels, you can use <tt>ps aux</tt> to get the PID. 
<pre> ps aux | grep -E 'PID|ssh' kill <PID> </pre> 4724b1eaca59306203c9cf2d0d7ce18d3019795a 751 750 2024-03-10T15:01:00Z Jmcgarry 18 /* Connecting to a dashboard */ wikitext text/x-wiki Dask is a python package that allows dynamic task scheduling as well as parallelisation of tasks (e.g. for-loop codes), DataFrames (pandas), Arrays (numpy), and Bags (Python lists). For advice on using Dask, please see the [https://docs.dask.org/en/stable/index.html documentation] and [https://tutorial.dask.org/00_overview.html tutorials]. == Setting up Dask == First you will need a python installation. It is probably best to install miniconda to run your own environment. In your python environment, install packages using pip3: * dask * dask distributed * bokeh (this is necessary to run a dashboard) * ipython (make sure this is up-to-date in your environment) You may also wish to install: * numpy * pandas * scikit-learn == Launching a Dask cluster on UHHPC == The following example will run 4 cores on a single node on the main queue with the default 1GB RAM per core. Launch <tt>ipython</tt> on the headnode to interactively start a dask-cluster. Copy the following commands to your terminal. <pre> import dask from dask_jobqueue import PBSCluster </pre> The cores and memory values for the call to <tt>PBSCluster</tt> do not reflect the resources you will get on your job, they are set as the maximum available on the type of machine you are using. Check the [[architecture]] page to get details of the machines in each queue. If <tt>resource_spec</tt> isn't set to the desired amount the dask-workers will default to 1 core on 1 node with 1GB RAM. The default <tt>walltime</tt> is 30 minutes. <pre> cluster = PBSCluster(cores=32, memory='190GB', queue='main', resource_spec='nodes=1:ppn=4', walltime='01:00:00') </pre> There are further options that can refine the <tt>qsub</tt> script submitted to job scheduler. A more advanced user could set up a jobqueue.yaml configuration file in: ~/.config/dask Calling cluster.scale() will allow you to set up workers. Here we start a single worker: <pre> cluster.scale(1) </pre> We will also connect to the Dask Distributed client. <pre> from dask.distributed import Client client = Client(cluster) </pre> Calling client from the terminal will show you the current number of active workers, threads, and memory. <pre> client </pre> You can check the status of your jobs from within the ipython interface using: <pre> !qstat -at -u <username> </pre> You can view the job script submitted by Dask to the scheduler by using: <pre> print(cluster.job_script()) </pre> == Connecting to a dashboard == To open the dashboard you will need to create an ssh-tunnel from your local machine to UHHPC. The default port for Dask is 8787, but this may be different if the port is active. Dask will warn you if 8787 is in use and choose another port. Open a terminal (UNIX) or PowerShell (Windows). Note which headnode (1-3) you are running <tt>ipython</tt> on and replace X in the command below. This will set up an ssh-tunnel as a background process. <pre> ssh -N -f -L localhost:8787:headnode<X>:<port> <username>@uhhpc.herts.ac.uk </pre> You should now be able to access the dashboard through a browser on your local machine by connecting to: localhost:8787/status If you need to find and kill any background ssh-tunnels, you can use <tt>ps aux</tt> on a UNIX system to get the PID. <pre> ps aux | grep -E 'PID|ssh' kill <PID> </pre> Alternatively, from Windows PowerShell you can use <tt>ps</tt>. 
<pre> ps | findstr -i ssh stop-process -id <PID> </pre> 3b04959e84889c6096a06cfa1d08dff34a96a708 752 751 2024-03-10T15:01:34Z Jmcgarry 18 /* Connecting to a dashboard */ wikitext text/x-wiki Dask is a python package that allows dynamic task scheduling as well as parallelisation of tasks (e.g. for-loop codes), DataFrames (pandas), Arrays (numpy), and Bags (Python lists). For advice on using Dask, please see the [https://docs.dask.org/en/stable/index.html documentation] and [https://tutorial.dask.org/00_overview.html tutorials]. == Setting up Dask == First you will need a python installation. It is probably best to install miniconda to run your own environment. In your python environment, install packages using pip3: * dask * dask distributed * dask-jobqueue (this provides the <tt>PBSCluster</tt> interface used below) * bokeh (this is necessary to run a dashboard) * ipython (make sure this is up-to-date in your environment) You may also wish to install: * numpy * pandas * scikit-learn == Launching a Dask cluster on UHHPC == The following example will run 4 cores on a single node on the main queue with the default 1GB RAM per core. Launch <tt>ipython</tt> on the headnode to interactively start a dask-cluster. Copy the following commands to your terminal. <pre> import dask from dask_jobqueue import PBSCluster </pre> The cores and memory values for the call to <tt>PBSCluster</tt> do not reflect the resources you will get for your job; they are set as the maximum available on the type of machine you are using. Check the [[architecture]] page to get details of the machines in each queue. If <tt>resource_spec</tt> isn't set to the desired amount the dask-workers will default to 1 core on 1 node with 1GB RAM. The default <tt>walltime</tt> is 30 minutes. <pre> cluster = PBSCluster(cores=32, memory='190GB', queue='main', resource_spec='nodes=1:ppn=4', walltime='01:00:00') </pre> There are further options that can refine the <tt>qsub</tt> script submitted to the job scheduler. A more advanced user could set up a jobqueue.yaml configuration file in: ~/.config/dask Calling <tt>cluster.scale()</tt> will allow you to set up workers. Here we start a single worker: <pre> cluster.scale(1) </pre> We will also connect to the Dask Distributed client. <pre> from dask.distributed import Client client = Client(cluster) </pre> Calling <tt>client</tt> from the terminal will show you the current number of active workers, threads, and memory. <pre> client </pre> You can check the status of your jobs from within the ipython interface using: <pre> !qstat -at -u <username> </pre> You can view the job script submitted by Dask to the scheduler by using: <pre> print(cluster.job_script()) </pre> == Connecting to a dashboard == To open the dashboard you will need to create an ssh-tunnel from your local machine to UHHPC. The default port for Dask is 8787, but this may be different if that port is already in use; Dask will warn you and choose another port. Open a terminal (UNIX) or PowerShell (Windows) on your local machine. Note which headnode (1-3) you are running <tt>ipython</tt> on and replace X in the command below. This will set up an ssh-tunnel as a background process. <pre> ssh -N -f -L localhost:8787:headnode<X>:<port> <username>@uhhpc.herts.ac.uk </pre> You should now be able to access the dashboard through a browser on your local machine by connecting to: localhost:8787/status If you need to find and kill any background ssh-tunnels, you can use <tt>ps aux</tt> on a UNIX system to get the PID. <pre> ps aux | grep -E 'PID|ssh' kill <PID> </pre> Alternatively, from Windows PowerShell you can use <tt>ps</tt>.
<pre> ps | findstr -i ssh stop-process -id <PID> </pre> 021ad2eb00a5da2c0182104fbd5e7743be6b678a Julia 0 94 755 2024-07-09T09:01:10Z Jmcgarry 18 Created page with "The stable release Julia-1.10.4 is installed on the headnode and is accessible through: <pre> module load julia-1.10.4 </pre> == Packages == There are no site packages for J..." wikitext text/x-wiki The stable release Julia-1.10.4 is installed on the headnode and is accessible through: <pre> module load julia-1.10.4 </pre> == Packages == There are no site packages for Julia. Users can use Pkg which will install desired packages in <tt>.julia</tt> in their home directory. Packages automatically precompile on install, but will need to do so again if there are any changes to the system that affect the package (e.g. moving between the main queue and the rocky-test queue). == Pluto == Pluto.jl is a notebook work environment for Julia with an in-built package manager. It can be installed using: <pre> Pkg.add("Pluto") </pre> To start a Pluto notebook server, from the Julia terminal run the command below. The default port for Pluto notebook servers in 1234, but you can specify a desired port. <pre> import Pluto; Pluto.run(host="0.0.0.0", port=1234) </pre> On a terminal on your local machine, set up an ssh-tunnel to the port. Make sure to use the correct node number/name. <pre> ssh -N -f -L localhost:1234:node<number>:1234 <username>@uhhpc.herts.ac.uk </pre> Then paste the access link (e.g. http://0.0.0.0:1234/?secret=<code>) into your local browser. f1b02cbb09bfa42a174afcd110185b2f4ca84236 756 755 2024-07-09T09:01:57Z Jmcgarry 18 /* Pluto */ wikitext text/x-wiki The stable release Julia-1.10.4 is installed on the headnode and is accessible through: <pre> module load julia-1.10.4 </pre> == Packages == There are no site packages for Julia. Users can use Pkg which will install desired packages in <tt>.julia</tt> in their home directory. Packages automatically precompile on install, but will need to do so again if there are any changes to the system that affect the package (e.g. moving between the main queue and the rocky-test queue). == Pluto == Pluto.jl is a notebook work environment for Julia with an in-built package manager. It can be installed using: <pre> Pkg.add("Pluto") </pre> To start a Pluto notebook server, from the Julia terminal run the command below. The default port for Pluto notebook servers is 1234, but you can specify a desired port. <pre> import Pluto; Pluto.run(host="0.0.0.0", port=1234) </pre> On a terminal on your local machine, set up an ssh-tunnel to the port. Make sure to use the correct node number/name. <pre> ssh -N -f -L localhost:1234:node<number>:1234 <username>@uhhpc.herts.ac.uk </pre> Then paste the access link (e.g. http://0.0.0.0:1234/?secret=<code>) into your local browser. dee2fc04feced5dcad8dc85d4a24440c4709acd8 757 756 2024-07-09T09:04:41Z Jmcgarry 18 /* Pluto */ wikitext text/x-wiki The stable release Julia-1.10.4 is installed on the headnode and is accessible through: <pre> module load julia-1.10.4 </pre> == Packages == There are no site packages for Julia. Users can use Pkg which will install desired packages in <tt>.julia</tt> in their home directory. Packages automatically precompile on install, but will need to do so again if there are any changes to the system that affect the package (e.g. moving between the main queue and the rocky-test queue). == Pluto == Pluto.jl is a notebook work environment for Julia with an in-built package manager. 
It can be installed using: <pre> using Pkg; Pkg.add("Pluto") </pre> To start a Pluto notebook server, run the command below from the Julia terminal. The default port for Pluto notebook servers is 1234, but you can specify a desired port. <pre> import Pluto; Pluto.run(host="0.0.0.0", port=1234) </pre> On a terminal on your local machine, set up an ssh-tunnel to the port. Make sure to use the correct node number/name. <pre> ssh -N -f -L localhost:1234:node<number>:1234 <username>@uhhpc.herts.ac.uk </pre> Then paste the access link (e.g. ht<span>tp://</span>0.0.0.0:1234/?secret=<string>) into your local browser; if it does not load, replace 0.0.0.0 with localhost. 519163544c6bd60836d572505c143cc9daa073b0 Queues 0 15 758 710 2024-07-11T14:31:04Z Jmcgarry 18 wikitext text/x-wiki There are five possible job queues available for general use on the system: * 'main' is the default queue: this submits to the main cluster. The maximum wall time on this queue is 1 week. * 'gpu' submits to the GPU nodes. The maximum wall time on this queue is 48 hours. * 'large' submits to 64 core nodes. The maximum wall time on this queue is 1 week, but you may require permission if your job requires a high number of nodes. * 'test' submits to 96 core nodes. Currently, access is limited to those requiring a high number of CPUs; if you don't already have access please contact a member of the team to discuss your needs. The maximum wall time is 1 week. * 'rocky-test' submits to 96 core nodes. This queue is a testing ground for all users to check the new OS setup meets their requirements. The maximum wall time is 1 week. * 'rocky-gpu' submits to A100 GPU nodes. This queue is a testing ground for all users to check the new OS setup meets their requirements. The maximum wall time is 96 hours. * 'forecast' submits to the dedicated air quality forecast nodes. == Default wall times == The default wall time for all the queues is 24 hours. If you want a job to run for longer, you must explicitly provide a wall time estimate. == Other defaults == The default memory use per process for the main queue is 1 Gb. If you need more memory than this, see the section on [[memory]]. The default number of nodes for a job on all queues is 1. 472ec81e4c73bcd7b71fbeb0ed31c43765fb3515 759 758 2024-07-11T14:31:22Z Jmcgarry 18 wikitext text/x-wiki There are seven possible job queues available for general use on the system: * 'main' is the default queue: this submits to the main cluster. The maximum wall time on this queue is 1 week. * 'gpu' submits to the GPU nodes. The maximum wall time on this queue is 48 hours. * 'large' submits to 64-core nodes. The maximum wall time on this queue is 1 week, but you may require permission if your job requires a high number of nodes. * 'test' submits to 96-core nodes. Currently, access is limited to those requiring a high number of CPUs; if you don't already have access, please contact a member of the team to discuss your needs. The maximum wall time is 1 week. * 'rocky-test' submits to 96-core nodes. This queue is a testing ground for all users to check that the new OS setup meets their requirements. The maximum wall time is 1 week. * 'rocky-gpu' submits to A100 GPU nodes. This queue is a testing ground for all users to check that the new OS setup meets their requirements. The maximum wall time is 96 hours. * 'forecast' submits to the dedicated air quality forecast nodes. == Default wall times == The default wall time for all the queues is 24 hours. If you want a job to run for longer, you must explicitly provide a wall time estimate.
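As an illustration of providing a wall time estimate (a minimal sketch only, assuming the Torque/PBS directive syntax used by <tt>qsub</tt> on this cluster; <tt>my_program</tt> is a placeholder for your own executable), a submission script requesting 72 hours on the main queue might look like:
<pre>
#!/bin/bash
# Request 4 cores on one node on the main queue, with a 72-hour wall time
#PBS -q main
#PBS -l nodes=1:ppn=4
#PBS -l walltime=72:00:00
cd $PBS_O_WORKDIR
./my_program   # placeholder for your own executable
</pre>
See the [[Jobs]] page for the full set of submission options.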
== Other defaults == The default memory use per process for the main queue is 1 GB. If you need more memory than this, see the section on [[memory]]. The default number of nodes for a job on all queues is 1. d629c7b5723cd58f4217917d395d75fde92431b5 Jupyter notebooks 0 95 763 2024-09-20T11:25:07Z Mjh 2 Created page with "To run a Jupyer notebook on a node: * Start an interactive job on the node with qsub * Run `jupyter notebook --ServerApp.ip=0.0.0.0 --no-browser`. You should see messages rep..." wikitext text/x-wiki To run a Jupyer notebook on a node: * Start an interactive job on the node with qsub * Run `jupyter notebook --ServerApp.ip=0.0.0.0 --no-browser`. You should see messages reporting the IP address and port being used. By default this is 8888 * On your local machine, do `ssh -N -f -L localhost:8888:nodeXXX:8888 username@uhhpc.herts.ac.uk` where 8888 is the port, nodeXXX is the node you're running on, username is your uhhpc username. * Now you can paste the URL given by Jupyter, e.g. `http://127.0.0.1:8888/tree?token=3bcef9f0c9d3a99dc0e82fd732cc4bd9b6797fd19132b4db` into your local web browser. (Important to use the one containing `127.0.0.1`. d8bd591c79dd10dcedbdd08a79b8d9cbbacd46bf 764 763 2024-09-20T11:25:26Z Mjh 2 wikitext text/x-wiki To run a Jupyer notebook on a node: * Start an interactive job on the node with qsub * Run `jupyter notebook --ServerApp.ip=0.0.0.0 --no-browser`. You should see messages reporting the IP address and port being used. By default this is 8888 * On your local machine, do `ssh -N -f -L localhost:8888:nodeXXX:8888 username@uhhpc.herts.ac.uk` where 8888 is the port, nodeXXX is the node you're running on, username is your uhhpc username. * Now you can paste the URL given by Jupyter, e.g. `http://127.0.0.1:8888/tree?token=3bcef9f0c9d3a99dc0e82fd732cc4bd9b6797fd19132b4db` into your local web browser. (Important to use the one containing `127.0.0.1`). aebfc9eb169faa1132d7a46f94e5b28beb4f9cac 765 764 2024-09-20T11:27:09Z Mjh 2 wikitext text/x-wiki To run a Jupyter notebook on a node: * Start an interactive job on the node with qsub * Run <tt>jupyter notebook --ServerApp.ip=0.0.0.0 --no-browser</tt>. You should see messages reporting the IP address and port being used. By default this is 8888. * On your local machine, do <tt>ssh -N -f -L localhost:8888:nodeXXX:8888 username@uhhpc.herts.ac.uk</tt> where 8888 is the port, nodeXXX is the node you're running on, and username is your uhhpc username. * Now you can paste the URL given by Jupyter, e.g. <tt>http://127.0.0.1:8888/tree?token=3bcef9f0c9d3a99dc0e82fd732cc4bd9b6797fd19132b4db</tt> into your local web browser. (It is important to use the URL containing <tt>127.0.0.1</tt>.) * The notebook that appears in your browser is running on the node you were allocated. * Be aware that when your job terminates the notebook server will disappear! 70d56ac0dd9aad3b7b997de75ffce6311a0e4c79 User:Cstopford 2 96 766 2024-11-13T09:59:01Z Mjh 2 Creating user page for new user. wikitext text/x-wiki Head of Particle Instruments and Diagnostics research group 4dcfa90a267c5e6d536118b6ce95e8a5e3107b82 User talk:Cstopford 3 97 767 2024-11-13T09:59:01Z Mjh 2 Welcome! wikitext text/x-wiki '''Welcome to ''Clusterwiki''!''' We hope you will contribute much and well. You will probably want to read the [https://www.mediawiki.org/wiki/Special:MyLanguage/Help:Contents help pages]. Again, welcome and have fun!
[[User:Mjh|Mjh]] ([[User talk:Mjh|talk]]) 09:59, 13 November 2024 (UTC) 66b5e1a9e2fa375edf0112c5a43703b9d7c3eca7 Main Page 0 1 768 725 2024-11-13T09:59:48Z Mjh 2 /* Welcome to the UHHPC documentation wiki */ wikitext text/x-wiki == Welcome to the UHHPC documentation wiki == This wiki is the location for documentation for the UH HPC service. If you are a cluster user, feel free to register for an account on this Wiki so that you can describe any method you use to achieve a particular result on the cluster. You must be approved for an account before editing the Wiki. Some local documentation on [[MediaWiki]] is available. == Getting started == * [[Read this first]] == Cluster basics == * [[Accounts]] and [[Account cancellation policy]] * [[Terms of use]] * [[Policies]] * [[Fair share]] * [[Access]] * [[Architecture]] * [[Networking]] * [[Storage]]; [[quota]] system * [[Administrators]]' contact details == Using the cluster == * [[Jobs]] * [[Queues]] * [[Reservations]] * [[SMP machines]] * [[GPUs]] * [[Modules]] * [[MPI]] * [[OpenMP]] * [[Parallelization|How to parallelize your job]] * [[Compilers]] * [[Software]] * [[Mail]] * [[Web server]] * [[Monitoring]] * [[LOFAR-UK Compute Facility]] == Troubleshooting == * [[Start here]] * [[Why doesn't my job run?]] * [[Job errors]] == Publications == * [[Acknowledgements]] * [[Cluster bibliography]] == How-Tos == * [[Star-CCM+|How to use Star-CCM+ on the cluster]] * [[Galaxy|How to use Galaxy on the cluster]] == Known problems == * [[Known problems]] dbcaadb3f5f1791910d1ee09ad84b2290c9d3cde 774 768 2025-02-21T13:27:19Z Jmcgarry 18 /* Welcome to the UHHPC documentation wiki */ wikitext text/x-wiki == Welcome to the UHHPC documentation wiki == This wiki is the location for documentation for the UH HPC service. <span style="color:red"> This is not where you log in to the cluster. Please see the [[Access]] page. </span> If you are a cluster user, feel free to register for an account on this Wiki so that you can describe any method you use to achieve a particular result on the cluster. You must be approved for an account before editing the Wiki. Some local documentation on [[MediaWiki]] is available. == Getting started == * [[Read this first]] == Cluster basics == * [[Accounts]] and [[Account cancellation policy]] * [[Terms of use]] * [[Policies]] * [[Fair share]] * [[Access]] * [[Architecture]] * [[Networking]] * [[Storage]]; [[quota]] system * [[Administrators]]' contact details == Using the cluster == * [[Jobs]] * [[Queues]] * [[Reservations]] * [[SMP machines]] * [[GPUs]] * [[Modules]] * [[MPI]] * [[OpenMP]] * [[Parallelization|How to parallelize your job]] * [[Compilers]] * [[Software]] * [[Mail]] * [[Web server]] * [[Monitoring]] * [[LOFAR-UK Compute Facility]] == Troubleshooting == * [[Start here]] * [[Why doesn't my job run?]] * [[Job errors]] == Publications == * [[Acknowledgements]] * [[Cluster bibliography]] == How-Tos == * [[Star-CCM+|How to use Star-CCM+ on the cluster]] * [[Galaxy|How to use Galaxy on the cluster]] == Known problems == * [[Known problems]] 74ba9f9078851d63b8f99ee3329c328c73d8add7 Matlab 0 88 769 698 2025-01-10T12:42:51Z Mjh 2 wikitext text/x-wiki The current working version of MATLAB on the cluster can be accessed using <tt>/soft/MATLAB/R2024a/bin/matlab</tt>. MATLAB on the cluster works best when scripted, but if you really need a GUI you can set up X forwarding; note, though, that for everyday computation there is little improvement in using the cluster over desktop PCs on campus.
The only exception would be for heavy computation and simulations, which will in any case likely be slowed down considerably by X forwarding. If you need a different version for a specific reason, please contact us. 9f0334ac706c556ca2039af49e904d7948dca36d User:Jess 2 98 770 2025-01-16T09:32:42Z Mjh 2 Creating user page for new user. wikitext text/x-wiki UHHPC Cluster IT Research Cluster Assistant 6540ff4c2ebbdc6d5880253f3ee308ac737c3e21 User talk:Jess 3 99 771 2025-01-16T09:32:42Z Mjh 2 Welcome! wikitext text/x-wiki '''Welcome to ''Clusterwiki''!''' We hope you will contribute much and well. You will probably want to read the [https://www.mediawiki.org/wiki/Special:MyLanguage/Help:Contents help pages]. Again, welcome and have fun! [[User:Mjh|Mjh]] ([[User talk:Mjh|talk]]) 09:32, 16 January 2025 (UTC) 1e36a8f846711f1115d5f4d512db75f78e3ad08a Python packages 0 49 772 685 2025-01-20T13:50:54Z Mjh 2 wikitext text/x-wiki Local Python packages installed in <tt>/soft</tt> include * numpy * scipy * astropy * tensorflow * h5py * mpi4py Set your PYTHONPATH to <tt>/soft/python3/usr/local/lib64/python3.9/site-packages</tt> to make these and many more available. You can install your own local copies of packages that don't exist on the system using <tt>pip install --user PACKAGENAME</tt>. These will appear in your <tt>.local</tt> directory. However, please don't install local copies of software that is already globally available! If you need an upgraded version, talk to the system [[administrators]]. == Python virtual environments == Sometimes it is necessary to run Python in a clean environment -- particularly if you want Python3 to be the default Python. To set this up, do the following (Python3 and bash assumed -- for tcsh use unsetenv not unset): <pre> unset PYTHONPATH python3 -m venv py3-venv source py3-venv/bin/activate pip3 install --upgrade pip </pre> This creates a directory <tt>py3-venv</tt>, which will be used for your own private installation of packages. In future, whenever you do <pre> source py3-venv/bin/activate </pre> your python version will be python3 and you will have the ability to install python3 packages to your py3-venv directory using pip (without the <tt>--user</tt> option). For example, to set up TensorFlow with GPU support: <pre> source py3-venv/bin/activate pip install tensorflow module load cuda-11.4 ipython3 </pre> If all goes well then you will be able to do <pre> import tensorflow as tf tf.print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU'))) </pre> Note that using this method means you will also have to install any other modules you need, such as matplotlib, using pip install. 2a7036d3a61602691a82cf28fd3a3252ea526377 Access 0 5 773 506 2025-02-21T13:13:23Z Jmcgarry 18 /* Access */ wikitext text/x-wiki == Access == The [[architecture|head node]]s of the cluster are accessible by ssh to uhhpc.herts.ac.uk, once you have an [[accounts|account]] set up. If you are working from a Unix desktop, you should be able to type <tt>ssh username@uhhpc.herts.ac.uk</tt>. If you are using Windows, this can be done from PowerShell.
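If you connect regularly, an entry in the <tt>~/.ssh/config</tt> file on your local machine can shorten the command (a minimal sketch, assuming the standard OpenSSH client on Unix or a recent Windows; replace <tt>username</tt> with your own cluster username):
<pre>
# ~/.ssh/config on your local machine
Host uhhpc
    HostName uhhpc.herts.ac.uk
    User username
</pre>
With this in place, <tt>ssh uhhpc</tt> is enough from either a Unix terminal or PowerShell.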
You may prefer to use an ssh client such as: * PuTTY[http://www.chiark.greenend.org.uk/~sgtatham/putty/] * X2Go[https://wiki.x2go.org/doku.php/download%3Astart] * MobaXterm[https://mobaxterm.mobatek.net/] Unless specific authorization from the [[administrators]] is provided to the contrary, individual compute nodes must be accessed either through batch [[jobs]] or via [[interactive jobs]] started from the head nodes: see also the [[policies|policy]] relating to this, and the example below. You may not log in to compute nodes directly or run code on the head nodes. 64b0e0884cd312c981e7038a75189e8dbd8b12bc
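As an illustration of the interactive route (a minimal sketch only; the resource values are placeholders and the <tt>-I</tt> flag assumes the Torque/PBS <tt>qsub</tt> used on this cluster), an interactive session on a compute node can be requested from a head node with:
<pre>
# Request one core on one compute node for two hours, interactively
qsub -I -q main -l nodes=1:ppn=1,walltime=02:00:00
</pre>
When the job starts you are given a shell on the allocated compute node; the session ends when you log out or the wall time expires.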